Listen, Understand, Translate: Triple Supervision Decouples End-to-end Speech-to-text Translation

Sep, 2020

TED: Triple Supervision Decouples End-to-end Speech-to-text Translation

Qianqian Dong, Mingxuan Wang, Hao Zhou, Shuang Xu, Bo Xu...

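TL;DR: This paper proposes Listen-Understand-Translate (LUT), a unified framework that uses triple supervision signals to decouple the end-to-end speech-to-text translation task. Applied to a range of speech translation benchmarks, it achieves state-of-the-art performance and outperforms previous methods.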

Abstract

An end-to-end speech-to-text translation (ST) system takes audio in a source language and outputs text in a target language. Inspired by neuroscience, in which humans have separate perception and cognitive systems for processing different kinds of information, we propose TED, \textbf{T}ransducer-\textbf{E}ncode