Spotlight "language model training" Papers
4 papers found
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo
Zachary Charles, Gabriel Teston, Lucio Dery et al.
NeurIPS 2025, spotlight, arXiv:2503.09799
12 citations
Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training
Will Merrill, Shane Arora, Dirk Groeneveld et al.
NeurIPS 2025, spotlight, arXiv:2505.23971
5 citations
DsDm: Model-Aware Dataset Selection with Datamodels
Logan Engstrom
ICML 2024, spotlight, arXiv:2401.12926
QuRating: Selecting High-Quality Data for Training Language Models
Alexander Wettig, Aatmik Gupta, Saumya Malik et al.
ICML 2024, spotlight, arXiv:2402.09739