Sep, 2024
Alignment-Free Training for Transducer-based Multi-Talker ASR
Takafumi Moriya, Shota Horiguchi, Marc Delcroix, Ryo Masumura, Takanori Ashihara...
TL;DR
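This study addresses the reliance on costly front-end source separation in multi-talker speech recognition. The proposed MT-RNNT-AFT method simplifies the label generation process, which enables training without accurate alignments, and it recognizes the speech of all speakers with a single pass through the encoder. Experiments show that the method performs comparably to state-of-the-art alternatives while significantly simplifying the training process.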
Abstract
Extending the RNN Transducer (RNNT) to recognize multi-talker speech is essential for wider automatic speech recognition (ASR) applications. Multi-talker RNNT (MT-RNNT) aims to achieve recognition without relying