"automated question generation" Papers
2 papers found
ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities
Ezra Karger, Houtan Bastani, Chen Yueh-Han et al.
ICLR 2025, poster, arXiv:2409.19839
31 citations
STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models
Narun Raman, Taylor Lundy, Thiago Amin et al.
NeurIPS 2025, poster, arXiv:2502.13119
3 citations