Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer

Oct, 2023

Paul-Ambroise Duquenne, Holger Schwenk, Benoît Sagot

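TL;DR: Independently trained encoders and decoders, combined through a shared fixed-size representation, can achieve competitive performance in speech-to-text translation. This work shows that the approach can be further improved with multilingual training. The authors observe significant improvements in zero-shot cross-modal speech translation, even outperforming a supervised XLSR-based approach for several languages.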
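
The modular setup described in the TL;DR can be illustrated with a toy sketch. This is not the authors' implementation: `toy_text_encoder`, `toy_speech_encoder`, `toy_decoder`, `compose`, and `EMBED_DIM` are hypothetical names, mean pooling stands in for trained encoders, and an argmax lookup stands in for a trained text decoder. The only point it shows is that any encoder emitting a vector in the shared fixed-size space can be paired with a decoder that has never seen that modality.

```python
from typing import Callable, List, Sequence

Vector = List[float]
EMBED_DIM = 3  # hypothetical size of the shared fixed-size representation


def mean_pool(sequence: Sequence[Sequence[float]]) -> Vector:
    """Collapse a variable-length sequence of feature vectors into one fixed-size vector."""
    length = len(sequence)
    return [sum(step[i] for step in sequence) / length for i in range(EMBED_DIM)]


def toy_text_encoder(token_features: Sequence[Sequence[float]]) -> Vector:
    # Stand-in for a text encoder trained to map sentences into the shared space.
    return mean_pool(token_features)


def toy_speech_encoder(frame_features: Sequence[Sequence[float]]) -> Vector:
    # Stand-in for a speech encoder trained independently to match the same space.
    return mean_pool(frame_features)


TOY_OUTPUTS = ["hello", "thank you", "goodbye"]  # illustrative target sentences


def toy_decoder(embedding: Vector) -> str:
    # Stand-in for a text decoder that only ever consumed text-encoder outputs.
    best = max(range(EMBED_DIM), key=lambda i: embedding[i])
    return TOY_OUTPUTS[best]


def compose(encoder: Callable[[Sequence[Sequence[float]]], Vector],
            decoder: Callable[[Vector], str]) -> Callable[[Sequence[Sequence[float]]], str]:
    """Pair any encoder with any decoder through the shared fixed-size vector."""
    return lambda inputs: decoder(encoder(inputs))


text_to_text = compose(toy_text_encoder, toy_decoder)
# Zero-shot cross-modal pairing: the decoder is reused unchanged with a speech encoder.
speech_to_text = compose(toy_speech_encoder, toy_decoder)

text_input = [[0.9, 0.1, 0.0], [0.7, 0.2, 0.1]]            # 2 token vectors
speech_input = [[0.8, 0.1, 0.1], [0.9, 0.0, 0.1],
                [0.6, 0.3, 0.1], [0.7, 0.2, 0.1]]          # 4 frame vectors

print(text_to_text(text_input))
print(speech_to_text(speech_input))
```

Because the two inputs differ in length but both collapse to an `EMBED_DIM`-sized vector, the decoder needs no knowledge of which modality produced its input. In a real system the encoders and decoder would be trained networks, and how well the pairing works depends on how closely the encoders' output spaces are aligned.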

Abstract

Recent research has shown that independently trained encoders and decoders, combined through a shared fixed-size representation, can achieve competitive performance in speech-to-text translation. In this work, we