2025 "gradient descent" Papers
4 papers found
Hamiltonian Descent Algorithms for Optimization: Accelerated Rates via Randomized Integration Time
Qiang Fu, Andre Wibisono
NeurIPS 2025, spotlight, arXiv:2505.12553
2 citations
Learning High-Degree Parities: The Crucial Role of the Initialization
Emmanuel Abbe, Elisabetta Cornacchia, Jan Hązła et al.
ICLR 2025, poster, arXiv:2412.04910
3 citations
Transformers are almost optimal metalearners for linear classification
Roey Magen, Gal Vardi
NeurIPS 2025, poster, arXiv:2510.19797
1 citation
Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought
Jianhao Huang, Zixuan Wang, Jason Lee
ICLR 2025, poster, arXiv:2502.21212
18 citations