"iterative model improvement" Papers
2 papers found
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback
Zaid Khan, Elias Stengel-Eskin, Jaemin Cho et al.
ICLR 2025posterarXiv:2410.06215
8
citations
AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training
Ziyu Wan, Xidong Feng, Muning Wen et al.
ICML 2024poster