August 2022
Kencorpus: A Kenyan Language Corpus of Swahili, Dholuo and Luhya for Natural Language Processing Tasks
Barack Wanjawa, Lilian Wanzare, Florence Indede, Owen McOnyango, Edward Ombui...
TL;DR: Kencorpus, the first corpus of its kind for low-resource Indigenous African languages, aims to fill the gap in Natural Language Processing and Machine Learning datasets for Swahili, Dholuo, and Luhya. It provides text and speech data to support data-driven applications such as machine translation, question answering, and transcription.