Poster "critical batch size" Papers
2 papers found
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang, Depen Morwani, Nikhil Vyas et al.
ICLR 2025posterarXiv:2410.21676
37
citations
Power Lines: Scaling laws for weight decay and batch size in LLM pre-training
Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.
NeurIPS 2025posterarXiv:2505.13738
15
citations