A Curious Case of the Missing Measure: Better Scores and Worse Generation

Citations: 0
Rank: #2275 of 3827 papers in ICLR 2025
Authors: 2
Data Points: 4

Abstract

Our field has a secret: nobody fully trusts audio evaluation measures. As neural audio generation nears perceptual fidelity, these measures fail to detect subtle differences that human listeners readily identify, often contradicting each other when comparing state-of-the-art models. The gap between human perception and automatic measures means we have increasingly sophisticated models while losing our ability to understand their flaws.

Citation History

Jan 26, 2026: 0
Jan 27, 2026: 0
Jan 27, 2026: 0
Jan 31, 2026: 0