NeurIPS "algorithmic tasks" Papers
2 papers found
Extrapolation by Association: Length Generalization Transfer In Transformers
Ziyang Cai, Nayoung Lee, Avi Schwarzschild et al.
NeurIPS 2025, spotlight, arXiv:2506.09251
7 citations
What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers
Pulkit Gopalani, Wei Hu
NeurIPS 2025, poster, arXiv:2506.13688
1 citation