"grokking phenomenon" Papers
5 papers found
Flatness is Necessary, Neural Collapse is Not: Rethinking Generalization via Grokking
Ting Han, Linara Adilova, Henning Petzka et al.
NeurIPS 2025 oral, arXiv:2509.17738
3 citations
Grokking at the Edge of Numerical Stability
Lucas Prieto, Melih Barsbey, Pedro Mediano et al.
ICLR 2025 poster, arXiv:2501.04697
17 citations
Transformers Learn Low Sensitivity Functions: Investigations and Implications
Bhavya Vasudeva, Deqing Fu, Tianyi Zhou et al.
ICLR 2025 poster, arXiv:2403.06925
7 citations
Unveiling the Dynamics of Information Interplay in Supervised Learning
Kun Song, Zhiquan Tan, Bochao Zou et al.
ICML 2024 poster
Why Do You Grok? A Theoretical Analysis on Grokking Modular Addition
Mohamad Amin Mohamadi, Zhiyuan Li, Lei Wu et al.
ICML 2024 poster