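TL;DR: This article explores the potential of pretrained speech models for E2E-ASR and finds that, on some ASR benchmark corpora, using pretrained models can surpass the current state-of-the-art recognition performance. Among them, the HuBERT model performs especially well. The experimental code and model parameters have been open-sourced.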
Abstract
Self-supervised pretraining on speech data has made considerable progress. High-fidelity representations of the speech signal are learned from large amounts of untranscribed data and show promising performance. Recently, several works have focused on evaluating the quality of self-supervi