Speech Robust Bench: A Robustness Benchmark For Speech Recognition

12 citations

arXiv:2403.07937

Citations

#664 of 3827 papers in ICLR 2025

Authors

Muhammad Shah, David Solans Noguero, Mikko Heikkilä, Bhiksha Raj, Nicolas Kourtellis

Topics

automatic speech recognition, robustness benchmark, input perturbations, discrete representations, self-training, demographic disparities, speech corruptions

Abstract

As Automatic Speech Recognition (ASR) models become ever more pervasive, it is important to ensure that they make reliable predictions under corruptions present in the physical and digital world. We propose Speech Robust Bench (SRB), a comprehensive benchmark for evaluating the robustness of ASR models to diverse corruptions. SRB is composed of 114 input perturbations that simulate a heterogeneous range of corruptions that ASR models may encounter when deployed in the wild. We use SRB to evaluate the robustness of several state-of-the-art ASR models and observe that model size and certain modeling choices, such as the use of discrete representations or self-training, appear to be conducive to robustness. We extend this analysis to measure the robustness of ASR models on data from various demographic subgroups, namely English and Spanish speakers, and males and females. Our results reveal noticeable disparities in the models' robustness across subgroups. We believe that SRB will significantly facilitate future research towards robust ASR models by making it easier to conduct comprehensive and comparable robustness evaluations.
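
To make the idea of a perturbation-based robustness evaluation concrete, the following is a minimal sketch of one such measurement: corrupt an utterance with white Gaussian noise at a chosen signal-to-noise ratio, then compare the word error rate (WER) before and after. It is not taken from SRB. The function names (`add_noise_at_snr`, `word_error_rate`, `wer_degradation`), the `transcribe` callable, and the choice of Gaussian noise and WER difference are all assumptions made for illustration; the benchmark's actual perturbations and metrics may differ.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, seed=0):
    """Add white Gaussian noise so the result has roughly the requested SNR in dB.

    `signal` is a list of float samples (hypothetical input format).
    """
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            # deletion, insertion, substitution (or match)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / len(ref)


def wer_degradation(transcribe, audio, reference, snr_db):
    """WER on the noisy utterance minus WER on the clean utterance.

    `transcribe` is a hypothetical callable mapping samples to a transcript.
    """
    clean_wer = word_error_rate(reference, transcribe(audio))
    noisy_audio = add_noise_at_snr(audio, snr_db)
    noisy_wer = word_error_rate(reference, transcribe(noisy_audio))
    return noisy_wer - clean_wer
```

In this sketch, a larger positive value from `wer_degradation` would indicate that the model is less robust to the given noise level. Repeating the measurement over many perturbation types and severities, and over utterances from different speaker subgroups, is one plausible way to obtain the kind of comparison the abstract describes.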

Citation History

[Chart: citation count over time, with data points on Jan 25, Jan 27, and Jan 30, 2026]