May 2022
Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models
Terra Blevins, Hila Gonen, Luke Zettlemoyer
TL;DR
This work investigates the learning process of multilingual pretrained models and finds that they achieve high in-language performance early in pretraining, with lower-level linguistic skills acquired before more complex ones. The point at which cross-lingual transfer is learned differs across language pairs, and the final model layer shows performance degradation over time as linguistic knowledge propagates to lower layers of the network.
Abstract
The emergent cross-lingual transfer seen in multilingual pretrained models has sparked significant interest in studying their behavior. However, because these analyses have focused on fully trained multilingual models, …