"language model pretraining" Papers
3 papers found
Exploring Polyglot Harmony: On Multilingual Data Allocation for Large Language Models Pretraining
Ping Guo, Yubing Ren, Binbin Liu et al.
NeurIPS 2025 (poster) · arXiv:2509.15556
1 citation
Group-Level Data Selection for Efficient Pretraining
Zichun Yu, Fei Peng, Jie Lei et al.
NeurIPS 2025 (poster) · arXiv:2502.14709
1 citation
Multi-Token Prediction Needs Registers
Anastasios Gerontopoulos, Spyridon Gidaris, Nikos Komodakis
NeurIPS 2025 (poster) · arXiv:2505.10518
4 citations