BriefGPT.xyz
May, 2024
Targeted Multilingual Adaptation for Low-resource Language Families
C. M. Downey, Terra Blevins, Dhwani Serai, Dwija Parikh, Shane Steinert-Threlkeld
TL;DR
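This work studies targeted multilingual training for low-resource languages, using adaptation to the Uralic language family as a case study. Experiments show that the size of the adapted vocabulary has relatively little effect on low-resource languages, and that low-resource languages can be aggressively sampled during training with negligible impact on performance in high-resource languages. These findings offer new best practices for language adaptation in targeted settings.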
Abstract
The "massively-multilingual" training of multilingual models is known to limit their utility in any one language, and they perform particularly poorly on low-resource languages. However, there is evidence that low-resou