Poster papers matching "gradient descent"
13 papers found
Convergence and Implicit Bias of Gradient Descent on Continual Linear Classification
Hyunji Jung, Hanseul Cho, Chulhee Yun
ICLR 2025 (poster) · arXiv:2504.12712
4 citations
Learning High-Degree Parities: The Crucial Role of the Initialization
Emmanuel Abbe, Elisabetta Cornacchia, Jan Hązła et al.
ICLR 2025 (poster) · arXiv:2412.04910
3 citations
Transformers are almost optimal metalearners for linear classification
Roey Magen, Gal Vardi
NeurIPS 2025 (poster) · arXiv:2510.19797
1 citation
Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought
Jianhao Huang, Zixuan Wang, Jason Lee
ICLR 2025 (poster) · arXiv:2502.21212
18 citations
Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?
Khashayar Gatmiry, Nikunj Saunshi, Sashank J. Reddi et al.
ICML 2024 (poster)
Convergence and Trade-Offs in Riemannian Gradient Descent and Riemannian Proximal Point
David Martínez-Rubio, Christophe Roux, Sebastian Pokutta
ICML 2024 (poster)
Differentiability and Optimization of Multiparameter Persistent Homology
Luis Scoccola, Siddharth Setlur, David Loiseaux et al.
ICML 2024 (poster)
Interpreting and Improving Diffusion Models from an Optimization Perspective
Frank Permenter, Chenyang Yuan
ICML 2024 (poster)
Learning Associative Memories with Gradient Descent
Vivien Cabannes, Berfin Simsek, Alberto Bietti
ICML 2024 (poster)
Non-stationary Online Convex Optimization with Arbitrary Delays
Yuanyu Wan, Chang Yao, Mingli Song et al.
ICML 2024 (poster)
Position: Do pretrained Transformers Learn In-Context by Gradient Descent?
Lingfeng Shen, Aayush Mishra, Daniel Khashabi
ICML 2024 (poster)
Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context
Xiang Cheng, Yuxin Chen, Suvrit Sra
ICML 2024 (poster)
Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot
Zixuan Wang, Stanley Wei, Daniel Hsu et al.
ICML 2024 (poster)