BriefGPT.xyz
May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu, Kwangyoun Kim, Shinji Watanabe, Kyu Han, Ryan McDonald...
TL;DR
Wav2Seq is the first self-supervised method for pre-training on speech data. It uses a pseudo language as a compact discrete representation and formulates a self-supervised pseudo speech recognition task: transcribing audio inputs into pseudo subword sequences.
Abstract
We introduce Wav2Seq, the first self-supervised approach to pre-train both parts of encoder-decoder models for speech data. We induce a pseudo language as a compact discrete representation and formulate a self-supervised pseudo speech recognition task: transcribing audio inputs into pseudo subword sequences.
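The idea of inducing a "pseudo language" from raw audio can be illustrated with a minimal sketch. This is not the authors' exact pipeline; it assumes a common recipe in which frame-level audio features are clustered into discrete units, consecutive repeats are collapsed, and the resulting unit sequence serves as the pseudo transcription an encoder-decoder learns to emit. All function names and parameters here are illustrative assumptions.

```python
import numpy as np

def kmeans(feats, k, iters=20, seed=0):
    """Tiny k-means: assign each feature frame to one of k discrete units.
    (Illustrative stand-in for a real clustering library.)"""
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance of every frame to every center, then hard assign.
        d = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = feats[assign == j].mean(0)
    return assign

def to_pseudo_tokens(units):
    """Collapse consecutive repeated units and render them as pseudo 'subwords'."""
    out = []
    for u in units:
        tok = f"u{u}"
        if not out or out[-1] != tok:
            out.append(tok)
    return out

# Toy stand-in for speech features: 100 frames of 8-dim vectors.
rng = np.random.default_rng(1)
feats = rng.normal(size=(100, 8))
units = kmeans(feats, k=5)
tokens = to_pseudo_tokens(units)
# `tokens` plays the role of the pseudo-subword target sequence that the
# decoder is trained to transcribe from the audio input.
```

In the full method, the compactness of the representation matters: collapsing repeats (and, in practice, further subword tokenization over the unit strings) shortens the target sequence so the decoder can be trained efficiently.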