WordNet and the Wortschatz Leipzig frequency lists are two good sources of English word data. WordNet is a lexical database of word relations, including definitions and synonyms, and an open-source fork of it is maintained. Leipzig offers frequency lists built from news, web, and Wikipedia corpora. Unix system word lists and Wiktionary are workable alternatives, though the former lack clear licensing and the latter is formatted for human readers rather than machine consumption. These datasets suit projects that need word lists filtered by different criteria, such as the most common words or a specific subset.
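As a rough illustration of pulling "common words" out of a frequency list, here is a minimal Python sketch. It assumes tab-separated rows of the form id, word, count, which may not match the exact layout of the file you download, so check the columns first. The function name `top_words` and the sample rows are hypothetical and made up for the example.

```python
from collections import Counter


def top_words(lines, n=10, min_len=1):
    """Return the n most frequent alphabetic words.

    Assumes each line is "id<TAB>word<TAB>count" (an assumed layout,
    not a documented guarantee of any particular dataset).
    """
    totals = Counter()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # skip rows that do not have three columns
        word, count = parts[1], parts[2]
        if not word.isalpha() or len(word) < min_len:
            continue  # drop punctuation, numbers, and too-short tokens
        try:
            # Merge case variants such as "The" and "the"
            totals[word.lower()] += int(count)
        except ValueError:
            continue  # skip rows whose count is not an integer
    return [word for word, _ in totals.most_common(n)]


# Made-up sample rows standing in for a real frequency file
sample = [
    "1\tthe\t5000\n",
    "2\t,\t4800\n",
    "3\tof\t3000\n",
    "4\tword\t120\n",
]
print(top_words(sample, n=3))
```

For a real file, you could pass an open file handle instead of the sample list, for example `top_words(open(path, encoding="utf-8"), n=1000, min_len=3)`, where `path` points at whichever list you downloaded.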