NeurIPS 2025 "large batch training" Papers
2 papers found
Through the River: Understanding the Benefit of Schedule-Free Methods for Language Model Training
Minhak Song, Beomhan Baek, Kwangjun Ahn et al.
NeurIPS 2025 poster, arXiv:2507.09846
2 citations
Understanding outer learning rates in Local SGD
Ahmed Khaled, Satyen Kale, Arthur Douillard et al.
NeurIPS 2025 poster