ICLR "optimization algorithms" Papers
3 papers found
Deconstructing What Makes a Good Optimizer for Autoregressive Language Models
Rosie Zhao, Depen Morwani, David Brandfonbrener et al.
ICLR 2025 poster
Descent with Misaligned Gradients and Applications to Hidden Convexity
Aditya Bhaskara, Ashok Cutkosky, Ravi Kumar et al.
ICLR 2025 poster
From Promise to Practice: Realizing High-performance Decentralized Training
Zesen Wang, Jiaojiao Zhang, Xuyang Wu et al.
ICLR 2025 poster, arXiv:2410.11998
2 citations