BriefGPT.xyz
Mar, 2022
KinyaBERT: a Morphology-aware Kinyarwanda Language Model
Antoine Nzeyimana, Andre Niyongabo Rubungo
TL;DR
Proposes a two-tier BERT architecture that leverages a morphological analyzer and explicitly represents morphological compositionality, addressing the inefficiency of BERT models on morphologically rich languages. The proposed model is evaluated on Kinyarwanda, a low-resource morphologically rich language. Results show that the proposed model, KinyaBERT, outperforms baseline models on the named entity recognition task and on a machine-translated GLUE benchmark.
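The two-tier idea described above can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the morpheme segmentation, vocabulary, dimensions, and the use of mean pooling in place of the paper's small morpheme-level transformer are all assumptions made for brevity.

```python
# Toy sketch of a two-tier, morphology-aware encoder (all names/dims assumed).
# Tier 1 composes each word's vector from its morpheme embeddings; tier 2
# would run a BERT-style encoder over the resulting word sequence.
import numpy as np

rng = np.random.default_rng(0)
D = 8  # embedding size (assumption)

# Hypothetical morphological analysis: a word is a list of morphemes.
vocab = {"tu": 0, "ra": 1, "kunda": 2, "a": 3, "bana": 4}
E = rng.normal(size=(len(vocab), D))  # morpheme embedding table

def encode_word(morphemes):
    """Tier 1: compose a word vector from its morpheme embeddings.
    Mean pooling stands in for a small morpheme-level transformer."""
    vecs = E[[vocab[m] for m in morphemes]]
    return vecs.mean(axis=0)

def encode_sentence(words):
    """Tier 2 input: one vector per word (not per sub-word token), which a
    full model would feed to a sentence-level transformer."""
    return np.stack([encode_word(w) for w in words])

sent = [["tu", "ra", "kunda"], ["a", "bana"]]  # toy split of two words
H = encode_sentence(sent)
print(H.shape)  # (2, 8): sequence length equals the number of words
```

The point of the design is that the sentence-level encoder sees a morphologically composed vector per word, so rare inflected forms share parameters through their morphemes instead of fragmenting into opaque sub-word pieces.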
Abstract
Pre-trained language models such as BERT have been successful at tackling many natural language processing tasks. However, the unsupervised sub-word tokenization methods commonly used in these models (e.g., byte-pair encoding) …