As Automatic Speech Recognition (ASR) models become ever more pervasive, it is important to ensure that they make reliable predictions under corruptions present in the physical and digital world. We propose Speech Robust Bench (SRB), a comprehensive benchmark for evaluating the robustness of ASR models to diverse corruptions. SRB comprises 69 input perturbations that are intended to simulate various corruptions ASR models may encounter in the physical and digital world. We use SRB to evaluate the robustness of several state-of-the-art ASR models and observe that model size and certain modeling choices, such as discrete representations and self-training, appear to be conducive to robustness. We extend this analysis to measure the robustness of ASR models on data from various demographic subgroups, namely English and Spanish speakers, and males and females, and observe noticeable disparities in the models' robustness across subgroups. We believe that SRB will facilitate future research towards robust ASR models by making it easier to conduct comprehensive and comparable robustness evaluations.

Speech Robust Bench: A Robustness Benchmark for Speech Recognition