BriefGPT.xyz
Dec, 2023
Improved Long-Form Speech Recognition by Jointly Modeling the Primary and Non-primary Speakers
Guru Prakash Arumugam, Shuo-yiin Chang, Tara N. Sainath, Rohit Prabhavalkar, Quan Wang...
TL;DR
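ASR models often suffer from a long-form deletion problem when transcribing lengthy audio. This work mitigates the problem by introducing a new technique that jointly models the different groups of speakers in the audio along with the standard transcription tokens.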
Abstract
ASR models often suffer from a long-form deletion problem where the model predicts sequential blanks instead of words when transcribing a lengthy