BriefGPT.xyz
Oct, 2022
How to Build Balanced and Efficient Multilingual Models: Protecting User Data While Maintaining Model Performance
You Can Have Your Data and Balance It Too: Towards Balanced and Efficient Multilingual Models
Tomasz Limisiewicz, Dan Malkin, Gabriel Stanovsky
TL;DR
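This paper proposes a new multilingual training technique based on teacher-student knowledge distillation. It uses balanced (subsampled) data to distill the knowledge of monolingual teacher models into a single multilingual student, which can improve the performance of natural language processing systems on low-resource languages.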
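The two ingredients named in the summary, balanced subsampling and teacher-student distillation, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the helper names `balanced_subsample` and `distillation_loss` are hypothetical, subsampling to the size of the smallest corpus is one possible balancing rule, and the KL divergence between temperature-softened teacher and student distributions is a common distillation objective that the summary does not confirm.

```python
import math
import random


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_sum = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_sum for x in scaled]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) between softened output distributions.

    Assumption: KL divergence is used here as a typical distillation
    objective; the source does not state which loss is used.
    """
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def balanced_subsample(corpora, seed=0):
    """Subsample each language to the size of the smallest corpus.

    Assumption: 'balanced' is taken to mean equal sentence counts per
    language; `corpora` maps a language code to a list of sentences.
    """
    rng = random.Random(seed)
    n = min(len(sentences) for sentences in corpora.values())
    return {lang: rng.sample(sentences, n) for lang, sentences in corpora.items()}


if __name__ == "__main__":
    # Toy corpora: one high-resource and one low-resource language.
    corpora = {
        "en": [f"en sentence {i}" for i in range(1000)],
        "mt": [f"mt sentence {i}" for i in range(50)],
    }
    balanced = balanced_subsample(corpora)
    print({lang: len(sentences) for lang, sentences in balanced.items()})

    # Hypothetical logits from a monolingual teacher and the multilingual
    # student for the same masked position.
    teacher = [2.0, 0.5, -1.0]
    student = [1.5, 0.7, -0.5]
    print(distillation_loss(teacher, student, temperature=2.0))
```

In such a setup, each sentence in the balanced data would be scored by the monolingual teacher for its own language, and the multilingual student would be trained to reduce the loss against that teacher's output.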
Abstract
Multilingual models have been widely used for cross-lingual transfer to low-resource languages. However, the performance on these languages is hindered by their underrepresentation in the pretraining data. To all