wikimedia/wikimedia-textcat: Mirror of https://gerrit.wikimedia.org/g/wikimedia/textcat. See https://www.mediawiki.org/wiki/Developer_access for contributing.
wikimedia-textcat is a PHP port of the TextCat language detection utility, which uses n-gram-based models to identify the language of a given text. It includes a classifier script (catus.php), a model-generation script (felis.php), and a model-conversion script (lm2php.php). The package ships with more than 70 language models built from wiki text.
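To make the idea of n-gram-based language detection concrete, here is a minimal Python sketch of the rank-order ("out-of-place") comparison commonly associated with TextCat-style classifiers. It is not the package's code or API: the function names, the padding scheme, the `max_n` and `top_k` parameters, and the tiny sample texts are all assumptions chosen for illustration.

```python
from collections import Counter


def ngram_profile(text, max_n=3, top_k=300):
    """Return the top_k most frequent character n-grams (1..max_n), most frequent first."""
    # Assumption: words are lowercased and joined with "_" so that word
    # boundaries show up as part of the n-grams.
    padded = "_" + "_".join(text.lower().split()) + "_"
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    # Sort by descending count, then alphabetically, so the ranking is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from the model get a fixed penalty."""
    rank = {gram: r for r, gram in enumerate(lang_profile)}
    penalty = len(lang_profile)
    return sum(
        abs(r - rank[gram]) if gram in rank else penalty
        for r, gram in enumerate(doc_profile)
    )


def classify(text, models):
    """Return the key of the model whose profile is closest to the text's profile."""
    doc = ngram_profile(text)
    return min(models, key=lambda lang: out_of_place(doc, models[lang]))


# Hypothetical toy models; real models are built from far larger corpora.
EN_SAMPLE = "the quick brown fox jumps over the lazy dog and the cat"
DE_SAMPLE = "der schnelle braune fuchs springt über den faulen hund und die katze"
models = {"en": ngram_profile(EN_SAMPLE), "de": ngram_profile(DE_SAMPLE)}

if __name__ == "__main__":
    print(classify("the dog and the fox", models))
```

In this sketch, `ngram_profile` plays the role of model generation and `classify` the role of the classifier. How the shipped scripts are actually invoked, and what format their model files use, is not covered here.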