2025 Poster "training dynamics" Papers
6 papers found
Attention layers provably solve single-location regression
Pierre Marion, Raphaël Berthier, Gérard Biau et al.
ICLR 2025, poster, arXiv:2410.01537
10 citations
Beyond Random: Automatic Inner-loop Optimization in Dataset Distillation
Muquan Li, Hang Gou, Dongyang Zhang et al.
NeurIPS 2025, poster, arXiv:2510.04838
1 citation
Bridging Critical Gaps in Convergent Learning: How Representational Alignment Evolves Across Layers, Training, and Distribution Shifts
Chaitanya Kapoor, Sudhanshu Srivastava, Meenakshi Khosla
NeurIPS 2025, poster, arXiv:2502.18710
1 citation
Feature Averaging: An Implicit Bias of Gradient Descent Leading to Non-Robustness in Neural Networks
Binghui Li, Zhixuan Pan, Kaifeng Lyu et al.
ICLR 2025, poster, arXiv:2410.10322
On the Feature Learning in Diffusion Models
Andi Han, Wei Huang, Yuan Cao et al.
ICLR 2025, poster, arXiv:2412.01021
13 citations
Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought
Jianhao Huang, Zixuan Wang, Jason Lee
ICLR 2025, poster, arXiv:2502.21212
18 citations