Poster "training dynamics analysis" Papers
5 papers found
A Theoretical Analysis of Self-Supervised Learning for Vision Transformers
Yu Huang, Zixin Wen, Yuejie Chi et al.
ICLR 2025 poster, arXiv:2403.02233
3 citations
Closed-Form Training Dynamics Reveal Learned Features and Linear Structure in Word2Vec-like Models
Dhruva Karkada, James Simon, Yasaman Bahri et al.
NeurIPS 2025 poster, arXiv:2502.09863
Less is More: Local Intrinsic Dimensions of Contextual Language Models
Benjamin Matthias Ruppik, Julius von Rohrscheidt, Carel van Niekerk et al.
NeurIPS 2025 poster, arXiv:2506.01034
On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent
Bingrui Li, Wei Huang, Andi Han et al.
ICLR 2025 poster, arXiv:2410.04870
9 citations
Quantitative convergence of trained neural networks to Gaussian processes
Andrea Agazzi, Eloy Mosig García, Dario Trevisan
NeurIPS 2025 poster