"benchmark generation" Papers
3 papers found
Physiome-ODE: A Benchmark for Irregularly Sampled Multivariate Time-Series Forecasting Based on Biological ODEs
Christian Klötergens, Vijaya Krishna Yalavarthi, Randolf Scholz et al.
ICLR 2025, poster, arXiv:2502.07489
2 citations
Silencer: From Discovery to Mitigation of Self-Bias in LLM-as-Benchmark-Generator
Peiwen Yuan, Yiwei Li, Shaoxiong Feng et al.
NeurIPS 2025, poster, arXiv:2505.20738
3 citations
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution
Alex Gu, Baptiste Roziere, Hugh Leather et al.
ICML 2024, poster