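TL;DR: This research proposes a new data augmentation method for improving automatic speech recognition (ASR) models. The method generates synthetic text and synthetic audio, and applying it to an ASR model for the Quechua language yields an 8.73% improvement in word error rate (WER).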
Abstract
A central obstacle for the deep learning techniques used in the
development of automatic speech recognition (ASR) models is the lack of
transcribed data. The goal of this research is to propose a new data
augm