ICML "gradient descent" Papers
11 papers found
Asymptotics of feature learning in two-layer networks after one gradient-step
Hugo Cui, Luca Pesce, Yatin Dandi et al.
ICML 2024 · spotlight · arXiv:2402.04980
Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning
Nikhil Vyas, Depen Morwani, Rosie Zhao et al.
ICML 2024 · spotlight · arXiv:2306.08590
Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?
Khashayar Gatmiry, Nikunj Saunshi, Sashank J. Reddi et al.
ICML 2024 · poster · arXiv:2410.08292
Convergence and Trade-Offs in Riemannian Gradient Descent and Riemannian Proximal Point
David Martínez-Rubio, Christophe Roux, Sebastian Pokutta
ICML 2024 · poster · arXiv:2403.10429
Differentiability and Optimization of Multiparameter Persistent Homology
Luis Scoccola, Siddharth Setlur, David Loiseaux et al.
ICML 2024 · poster · arXiv:2406.07224
Interpreting and Improving Diffusion Models from an Optimization Perspective
Frank Permenter, Chenyang Yuan
ICML 2024 · poster · arXiv:2306.04848
Learning Associative Memories with Gradient Descent
Vivien Cabannnes, Berfin Simsek, Alberto Bietti
ICML 2024 · poster
Non-stationary Online Convex Optimization with Arbitrary Delays
Yuanyu Wan, Chang Yao, Mingli Song et al.
ICML 2024 · poster
Position: Do pretrained Transformers Learn In-Context by Gradient Descent?
Lingfeng Shen, Aayush Mishra, Daniel Khashabi
ICML 2024 · poster
Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context
Xiang Cheng, Yuxin Chen, Suvrit Sra
ICML 2024 · poster · arXiv:2312.06528
Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot
Zixuan Wang, Stanley Wei, Daniel Hsu et al.
ICML 2024 · poster · arXiv:2406.06893