NEURIPS 2025 "training dynamics analysis" Papers
6 papers found
Closed-Form Training Dynamics Reveal Learned Features and Linear Structure in Word2Vec-like Models
Dhruva Karkada, James Simon, Yasaman Bahri et al.
NEURIPS 2025 | poster | arXiv:2502.09863
EvoLM: In Search of Lost Language Model Training Dynamics
Zhenting Qi, Fan Nie, Alexandre Alahi et al.
NEURIPS 2025 | oral | arXiv:2506.16029 | 3 citations
From Condensation to Rank Collapse: A Two-Stage Analysis of Transformer Training Dynamics
Zheng-An Chen, Tao Luo
NEURIPS 2025 | oral | arXiv:2510.06954 | 1 citation
Language Model Behavioral Phases are Consistent Across Architecture, Training Data, and Scale
James Michaelov, Roger Levy, Benjamin Bergen
NEURIPS 2025 | oral | arXiv:2510.24963
Less is More: Local Intrinsic Dimensions of Contextual Language Models
Benjamin Matthias Ruppik, Julius von Rohrscheidt, Carel van Niekerk et al.
NEURIPS 2025 | poster | arXiv:2506.01034
Quantitative convergence of trained neural networks to Gaussian processes
Andrea Agazzi, Eloy Mosig García, Dario Trevisan
NEURIPS 2025 | poster