ICML 2024 "batch size effects" Papers
2 papers found
Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning
Nikhil Vyas, Depen Morwani, Rosie Zhao et al.
ICML 2024, spotlight, arXiv:2306.08590
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan et al.
ICML 2024, poster, arXiv:2306.04815