Exploring Phoneme-Level Speech Representations for End-to-End Speech Translation

June 2019

Elizabeth Salesky, Matthias Sperber, Alan W Black

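TL;DR: This work replaces conventional frame-level speech features with phoneme-like speech representations as the source input for end-to-end speech translation. Compared with the conventional approach, model performance improves significantly while training time drops by 60%.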
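To make the contrast between frame-level and phoneme-like inputs concrete, the following is a minimal sketch of one plausible way to shorten a frame sequence: consecutive frames that share a phoneme label are merged into a single averaged vector. The function name `compress_by_phoneme`, the toy feature values, and the assumption that per-frame phoneme labels are already available (for example from an alignment or a frame classifier) are all illustrative and not taken from the text above.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def compress_by_phoneme(
    frames: Sequence[Sequence[float]],
    labels: Sequence[str],
) -> Tuple[List[List[float]], List[str]]:
    """Average each run of consecutive frames that share a phoneme label.

    Hypothetical helper: assumes one label per frame is already available.
    """
    if len(frames) != len(labels):
        raise ValueError("frames and labels must have the same length")

    compressed: List[List[float]] = []
    segment_labels: List[str] = []
    # groupby merges only adjacent equal labels, so a phoneme that recurs
    # later in the utterance starts a new segment.
    for label, run in groupby(zip(labels, frames), key=lambda pair: pair[0]):
        vectors = [vec for _, vec in run]
        dim = len(vectors[0])
        mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
        compressed.append(mean)
        segment_labels.append(label)
    return compressed, segment_labels


# Toy input: 6 frames of 2-dimensional features (made-up values).
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [3.0, 1.0], [2.0, 2.0]]
labels = ["k", "k", "ae", "ae", "ae", "k"]
vectors, segments = compress_by_phoneme(frames, labels)
print(len(frames), "frames ->", len(vectors), "segments:", segments)
```

In this toy case six frames collapse to three segment vectors, which illustrates why a phoneme-like input sequence is shorter than the frame-level one and can therefore be cheaper to train on.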

Abstract

Previous work on end-to-end translation from speech has primarily used frame-level features as speech representations, which creates longer, sparser sequences than text. We show that a naive method to create comp