Most Cited 2025 Papers by Thomas Foster
3 papers found
Conference
#1
Measuring what Matters: Construct Validity in Large Language Model Benchmarks
Andrew M. Bean, Ryan Othniel Kearns, Angelika Romanou et al.
NeurIPS 2025, poster, arXiv:2511.04703
9 citations
#2
The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements
Bingchen Zhao, Despoina Magka, Minqi Jiang et al.
NeurIPS 2025, poster, arXiv:2506.22419
2 citations
#3
LILO: Learning to Reason at the Frontier of Learnability
Thomas Foster, Anya Sims, Johannes Forkel et al.
NeurIPS 2025, poster