NEURIPS 2025 "language model pre-training" Papers
2 papers found
Data Mixture Optimization: A Multi-fidelity Multi-scale Bayesian Framework
Thomson Yen, Andrew Siah, Haozhe Chen et al.
NEURIPS 2025 · poster · arXiv:2503.21023 · 1 citation
Nemotron-CLIMB: Clustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
Shizhe Diao, Yu Yang, Yonggan Fu et al.
NEURIPS 2025 · spotlight · arXiv:2504.13161 · 19 citations