Corpora for African languages - An Crúbadán
Posted by sociolingo on April 1, 2008
Source: Aflat
Thu, 2008-02-07 04:42 — scannell
Description:
The Crúbadán Project is devoted to creating basic language technology for minority and under-resourced languages using web crawling and statistical techniques. As of early 2008 we have collected text corpora for 419 languages, including more than 125 African languages, and have used these to create open-source spell checkers for more than 20 languages. Please contact Kevin Scannell (http://borel.slu.edu/) if you are interested in developing open-source resources for other African languages using these data.