"data mixture optimization" Papers
3 papers found
Nemotron-CLIMB: Clustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
Shizhe Diao, Yu Yang, Yonggan Fu et al.
NeurIPS 2025spotlightarXiv:2504.13161
19
citations
RegMix: Data Mixture as Regression for Language Model Pre-training
Qian Liu, Xiaosen Zheng, Niklas Muennighoff et al.
ICLR 2025posterarXiv:2407.01492
99
citations
Data Engineering for Scaling Language Models to 128K Context
Yao Fu, Rameswar Panda, Xinyao Niu et al.
ICML 2024poster