ICLR 2025 "benchmark design" Papers
2 papers found
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Clemencia Siro, Guy Gur-Ari, Gaurav Mishra et al.
ICLR 2025, oral, arXiv:2206.04615
2192 citations
Commit0: Library Generation from Scratch
Wenting Zhao, Nan Jiang, Celine Lee et al.
ICLR 2025, poster, arXiv:2412.01769
18 citations