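TL;DR: This study compares cascaded and end-to-end models under different resource conditions, and incorporates phone features into speech translation (ST) models to improve their performance, narrowing the gap between end-to-end and cascaded models.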
Abstract
End-to-end models for speech translation (ST) more tightly couple speech
recognition (ASR) and machine translation (MT) than a traditional cascade of
separate ASR and MT models, with simpler model architectures a