BriefGPT.xyz
Apr, 2024
Efficient infusion of self-supervised representations in Automatic Speech Recognition
Darshan Prabhu, Sai Ganesh Mirishkar, Pankaj Wasnik
TL;DR
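We propose two simple methods, based on frame-wise addition and on cross-attention, for efficiently incorporating representations from self-supervised learning (SSL) models into an ASR architecture. The approach avoids using the SSL model during training, speeds up training, and yields significant performance gains over baseline models on the Librispeech and Tedlium datasets.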
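The two fusion ideas named in the summary can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the dimensions, the single attention head, the residual connection, and the random matrices standing in for learned weights are all assumptions made for illustration. The SSL representations are simply treated as given arrays.

```python
import numpy as np


def framewise_add(enc, ssl_feats, w_proj):
    """Frame-wise addition: project SSL features and add them frame by frame.

    enc:       (T, d_model) ASR encoder features
    ssl_feats: (T, d_ssl)   SSL features, one per encoder frame (assumption)
    w_proj:    (d_ssl, d_model) projection matrix (hypothetical, random here)
    """
    if enc.shape[0] != ssl_feats.shape[0]:
        raise ValueError("frame-wise addition needs equal frame counts")
    return enc + ssl_feats @ w_proj


def cross_attention(enc, ssl_feats, w_q, w_k, w_v):
    """Single-head cross-attention: encoder frames query the SSL frames.

    Frame counts may differ, since attention aligns the two sequences.
    """
    q = enc @ w_q                      # (T_enc, d_k)
    k = ssl_feats @ w_k                # (T_ssl, d_k)
    v = ssl_feats @ w_v                # (T_ssl, d_model)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over SSL frames
    return enc + weights @ v           # residual connection (assumption)


rng = np.random.default_rng(0)
T_enc, T_ssl, d_model, d_ssl, d_k = 50, 75, 8, 16, 8

enc = rng.standard_normal((T_enc, d_model))
ssl_same_len = rng.standard_normal((T_enc, d_ssl))
ssl_other_len = rng.standard_normal((T_ssl, d_ssl))

w_proj = rng.standard_normal((d_ssl, d_model))
w_q = rng.standard_normal((d_model, d_k))
w_k = rng.standard_normal((d_ssl, d_k))
w_v = rng.standard_normal((d_ssl, d_model))

fused_add = framewise_add(enc, ssl_same_len, w_proj)
fused_att = cross_attention(enc, ssl_other_len, w_q, w_k, w_v)
print(fused_add.shape, fused_att.shape)
```

In this sketch, frame-wise addition requires the two sequences to have the same number of frames, while cross-attention tolerates different frame rates because each encoder frame attends over all SSL frames.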
Abstract
Self-supervised learned (SSL) models such as Wav2vec and HuBERT yield state-of-the-art results on speech-related tasks. Given the effectiveness of such models, it is advantageous to use them in conventional …