March 2023
Unsupervised Pre-Training For Data-Efficient Text-to-Speech On Low Resource Languages
Seongyeon Park, Myungseo Song, Bohyung Kim, Tae-Hyun Oh
TL;DR
This paper proposes a neural text-to-speech model based on unsupervised pre-training: by learning to reconstruct warped mel-spectrograms, the model captures temporal structure and uses data more efficiently, achieving significant performance gains in low-resource language settings.
Abstract
Neural text-to-speech (TTS) models can synthesize natural human speech when trained on large amounts of transcribed speech. However, collecting such large-scale transcribed data is expensive. This paper proposes an unsupervised pre-training method based on learning to reconstruct warped mel-spectrograms from untranscribed speech.
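
Below is a minimal sketch of the pre-training objective described in the TL;DR, assuming PyTorch: temporally warp an untranscribed mel-spectrogram and train a model to reconstruct the original. The random_time_warp helper and the toy MelAutoencoder are illustrative assumptions, not the paper's actual seq2seq TTS architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

def random_time_warp(mel: torch.Tensor, min_rate: float = 0.8,
                     max_rate: float = 1.2) -> torch.Tensor:
    """Stretch or compress a mel-spectrogram (batch, n_mels, frames) along
    time by a random rate, then resample back to the original length so the
    warped input and the reconstruction target have matching shapes."""
    _, _, frames = mel.shape
    rate = torch.empty(1).uniform_(min_rate, max_rate).item()
    warped_len = max(1, int(frames * rate))
    warped = F.interpolate(mel, size=warped_len, mode="linear", align_corners=False)
    return F.interpolate(warped, size=frames, mode="linear", align_corners=False)

class MelAutoencoder(nn.Module):
    """Toy 1-D convolutional encoder-decoder standing in for the network
    being pre-trained (the paper pre-trains a full TTS model)."""
    def __init__(self, n_mels: int = 80, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(n_mels, hidden, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.decoder = nn.Conv1d(hidden, n_mels, kernel_size=5, padding=2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

model = MelAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One pre-training step on a dummy batch of untranscribed speech features.
mel = torch.randn(4, 80, 200)      # (batch, n_mels, frames)
warped = random_time_warp(mel)     # input: temporally distorted copy
recon = model(warped)              # model must undo the warp
loss = F.l1_loss(recon, mel)       # reconstruction objective
optimizer.zero_grad()
loss.backward()
optimizer.step()

Because the target is the unwarped original, the model is pushed to learn temporal relations in speech without any transcripts, which is what makes the subsequent supervised TTS fine-tuning data-efficient.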