ASR Benchmarking: Need for a More Representative Conversational Dataset

Sep, 2024

Gaurav Maheshwari, Dmitry Ivanov, Théo Johannet, Kevin El Haddad

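TL;DR: This study addresses the failure of existing automatic speech recognition (ASR) benchmarks to reflect the complexity of real-world conversational environments, and proposes a multilingual conversational dataset derived from TalkBank. It finds that mainstream ASR models perform significantly worse in this conversational setting, and it reveals a correlation between speech disfluencies and word error rate, underscoring the need for more realistic conversational benchmarks.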

Abstract

Automatic Speech Recognition (ASR) systems have achieved remarkable performance on widely used benchmarks such as LibriSpeech and Fleurs. However, these benchmarks do not adequately reflect the complexities of real-world conversational environments, where speech is often unstructured a