Nov, 2024
Aligning Pre-trained Models for Spoken Language Translation
Šimon Sedláček, Santosh Kesiraju, Alexander Polok, Jan Černocký
TL;DR
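This work addresses a key problem in spoken language translation (ST): how to effectively align pre-trained automatic speech recognition (ASR) and machine translation (MT) models. The authors propose a small connector module (Q-Former) to perform this alignment. Their experiments show that stronger base ASR and MT models significantly improve translation quality, and that the approach is scalable and practical.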
Abstract
This paper investigates a novel approach to end-to-end Speech Translation (ST) based on aligning frozen pre-trained Automatic Speech Recognition (ASR) and …