2025 "communication-efficient training" Papers
2 papers found
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo
Zachary Charles, Gabriel Teston, Lucio Dery et al.
NeurIPS 2025, spotlight, arXiv:2503.09799
12 citations
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob, Lorenzo Sani, Meghdad Kurmanji et al.
ICLR 2025, poster, arXiv:2410.05021
2 citations