BriefGPT.xyz
Nov, 2023
Efficiently Adapting Pretrained Language Models To New Languages
Zoltan Csaki, Pian Pawakapan, Urmish Thakker, Qiantong Xu
TL;DR
This paper studies how to efficiently adapt any existing pretrained large language model to a new language while avoiding catastrophic forgetting and tokenizer inefficiency. The recipe improves the tokenizer's encoding efficiency by adding new tokens from the target language, and investigates data-mixture recipes for continued pretraining. Experiments show that this recipe adapts English-pretrained LLMs to Hungarian and Thai with better performance than open-source models, while causing only minimal regression on English.
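The tokenizer-efficiency problem above is commonly measured as fertility: the average number of tokens a tokenizer emits per word. A vocabulary without target-language tokens splits words into many pieces, inflating sequence length and cost. The sketch below illustrates this metric with two toy tokenizers standing in for the before/after vocabularies; the tokenizers and the Hungarian sample text are illustrative assumptions, not the paper's actual setup.

```python
def fertility(tokenize, texts):
    """Average number of tokens produced per whitespace-delimited word."""
    total_tokens = sum(len(tokenize(t)) for t in texts)
    total_words = sum(len(t.split()) for t in texts)
    return total_tokens / total_words

# Toy tokenizers (illustrative): a character-level tokenizer mimics a
# vocabulary with no target-language coverage; a whole-word tokenizer
# mimics one where the new tokens have been added.
char_tok = lambda s: [c for c in s if not c.isspace()]
word_tok = lambda s: s.split()

sample = ["jó napot kívánok"]
print(fertility(char_tok, sample))  # many tokens per word
print(fertility(word_tok, sample))  # 1.0 once each word is in the vocabulary
```

Lower fertility on the target language means shorter sequences, cheaper training and inference, and is the motivation for extending the vocabulary before continued pretraining.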
Abstract
Recent large language models (LLM) exhibit sub-optimal performance on low-resource languages, as the training data of these models is usually dominated by English and other high-resource languages. Furthermore, i