May, 2023
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages
Andrew Rouditchenko, Sameer Khurana, Samuel Thomas, Rogerio Feris, Leonid Karlinsky...
TL;DR
This paper compares how two multilingual speech models perform when adapted to languages unseen during pre-training, and finds that the number of language families and the hours of training data covered in each model's pre-training data predict its performance, rather than differences between the pre-training methods.
Abstract
Recent models such as XLS-R and Whisper have made multilingual speech technologies more accessible by pre-training on audio from around 100 spoken languages each. However, there are thousands of spoken languages …