BriefGPT.xyz
Jan, 2021
Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for Low-resource Speech Recognition
Fusing Wav2vec2.0 and BERT into End-to-end Model for Low-resource Speech Recognition
Cheng Yi, Shiyu Zhou, Bo Xu
TL;DR
This paper studies how to fuse a pretrained acoustic encoder and a pretrained linguistic encoder into an end-to-end automatic speech recognition (ASR) model to improve performance, especially in low-resource ASR settings. Experiments show that the proposed method outperforms other end-to-end models on the 15-hour CALLHOME corpus.
Abstract
Self-supervised acoustic pre-training has achieved impressive results on low-resource speech recognition tasks. It indicates that the pretrain-and-finetune paradigm is a promising direction. In this work, we prop
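To make the fusion idea concrete, here is a minimal sketch of combining an acoustic encoder's frame-level features with a linguistic encoder's token-level features. This is not the paper's implementation: the real models would be wav2vec2.0 and BERT, while here fixed random projections stand in for the pretrained weights, and the fusion mechanism (cross-attention from acoustic frames to linguistic states) is an assumed, illustrative design. All names and dimensions below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: both wav2vec2.0-base and BERT-base output
# 768-dim features; FUSE_DIM and VOCAB are illustrative choices.
A_DIM, L_DIM, FUSE_DIM, VOCAB = 768, 768, 256, 30

# Stand-in trainable weights for the fusion layers and output head.
W_fuse_a = rng.standard_normal((A_DIM, FUSE_DIM)) * 0.02
W_fuse_l = rng.standard_normal((L_DIM, FUSE_DIM)) * 0.02
W_out = rng.standard_normal((FUSE_DIM, VOCAB)) * 0.02

def acoustic_encoder(frames):
    # Placeholder for wav2vec2.0: (T, A_DIM) features, one per audio frame.
    return rng.standard_normal((frames, A_DIM))

def linguistic_encoder(tokens):
    # Placeholder for BERT: (U, L_DIM) contextual token embeddings.
    return rng.standard_normal((tokens, L_DIM))

def fuse(acoustic, linguistic):
    """Project both encoder outputs into a shared space, let each acoustic
    frame attend over the linguistic states (scaled dot-product attention),
    and add the attended linguistic context back to the frame features."""
    q = acoustic @ W_fuse_a                      # (T, FUSE_DIM)
    k = linguistic @ W_fuse_l                    # (U, FUSE_DIM)
    scores = q @ k.T / np.sqrt(FUSE_DIM)         # (T, U)
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)      # row-wise softmax
    context = attn @ k                           # (T, FUSE_DIM)
    return q + context

acoustic = acoustic_encoder(frames=50)           # ~1 s of audio at 50 fps
linguistic = linguistic_encoder(tokens=12)       # a 12-token transcript hypothesis
logits = fuse(acoustic, linguistic) @ W_out      # (50, 30) per-frame vocab logits
print(logits.shape)
```

The point of the sketch is the shape bookkeeping: the two encoders produce sequences of different lengths (50 frames vs. 12 tokens), so some alignment mechanism, here cross-attention, is needed before the fused features can feed a per-frame output head.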