Training Autoregressive Speech Recognition Models with Limited in-domain Supervision

Oct, 2022

Chak-Fai Li, Francis Keith, William Hartmann, Matthew Snover

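TL;DR: This paper explores how an autoregressive encoder-decoder model, trained with a combination of semi-supervised learning and self-training, can handle conversational speech when supervised data is limited. The results show that when in-domain data is scarce, training such an autoregressive model on pseudo-transcripts produced by the XLS-R model works better than fine-tuning XLS-R itself, and can reduce WER by 8% absolute.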
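To make the "8% absolute" wording concrete, here is a minimal sketch of how word error rate (WER) is commonly computed and how an absolute reduction differs from a relative one. The function names and the example numbers are illustrative assumptions; they are not taken from the paper, and the paper's own scoring setup may differ (for example in text normalization).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def absolute_reduction(baseline_wer: float, new_wer: float) -> float:
    """Difference in WER, in the same units as the inputs."""
    return baseline_wer - new_wer


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Reduction expressed as a fraction of the baseline WER."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers chosen only to illustrate the arithmetic.
baseline, improved = 0.40, 0.32
print(f"absolute: {absolute_reduction(baseline, improved):.2f}")
print(f"relative: {relative_reduction(baseline, improved):.2f}")
```

With these made-up values, a drop from 40% to 32% WER is an 8-point absolute reduction but a 20% relative reduction, which is why the TL;DR specifies "absolute".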

Abstract

Advances in self-supervised learning have significantly reduced the amount of transcribed audio required for training. However, the majority of work in this area is focused on read speech. We explore limited supervision