BriefGPT.xyz
Sep, 2023
Updated Corpora and Benchmarks for Long-Form Speech Recognition
Jennifer Drexler Fox, Desh Raj, Natalie Delworth, Quinn McNamara, Corey Miller...
TL;DR
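This paper re-releases three standard ASR corpora for long-form ASR research, studies the mismatch between training and test data, and uses benchmarks to show the robustness of models trained on long-form data under this domain shift.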
Abstract
The vast majority of ASR research uses corpora in which both the training and test data have been pre-segmented into utterances. In most r