BriefGPT.xyz
Jan, 2025
Language Bias in Self-Supervised Learning For Automatic Speech Recognition
Edward Storey, Naomi Harte, Peter Bell
TL;DR
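This study examines language bias in multilingual self-supervised learning for automatic speech recognition and finds that existing models perform unevenly across languages. Using an approach based on the Lottery Ticket Hypothesis, it identifies language-specific subnetworks and tests their performance, showing that during fine-tuning the model favors weights learned from the languages with the most data while neglecting to build on traditional linguistic knowledge.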
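As a rough illustration of the idea of language-specific subnetworks, the sketch below applies one-shot magnitude pruning to two toy weight vectors and measures how much the surviving positions overlap. This is not the authors' code or their exact procedure: the function names `magnitude_mask` and `mask_overlap`, the toy weights, and the 50% sparsity level are all hypothetical, and magnitude pruning is only one common way of finding "winning ticket" masks.

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that keeps the largest-magnitude weights."""
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by ascending magnitude; the first n_prune are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def mask_overlap(mask_a, mask_b):
    """Intersection over union of the kept positions in two masks."""
    kept_a = {i for i, m in enumerate(mask_a) if m}
    kept_b = {i for i, m in enumerate(mask_b) if m}
    union = kept_a | kept_b
    return len(kept_a & kept_b) / len(union) if union else 1.0


# Hypothetical toy weights, standing in for one layer fine-tuned on two languages.
weights_lang_a = [0.9, -0.1, 0.4, 0.05, -0.7, 0.2]
weights_lang_b = [0.8, 0.3, -0.02, 0.1, -0.6, 0.5]

mask_a = magnitude_mask(weights_lang_a, sparsity=0.5)
mask_b = magnitude_mask(weights_lang_b, sparsity=0.5)
overlap = mask_overlap(mask_a, mask_b)
print(mask_a, mask_b, overlap)
```

A high overlap between masks found for different languages would suggest that fine-tuning relies on a shared set of weights rather than on language-specific ones; the toy numbers here carry no empirical meaning.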
Abstract
Self-Supervised Learning (SSL) is used in deep learning to train on large datasets without the need for expensive labelling of the data. Recently, large Automatic Speech Recognition (ASR) models such as XLS-R hav…