ICML 2024 "gradient descent" Papers

11 papers found

Asymptotics of feature learning in two-layer networks after one gradient-step

Hugo Cui, Luca Pesce, Yatin Dandi et al.

ICML 2024 · spotlight · arXiv:2402.04980

Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning

Nikhil Vyas, Depen Morwani, Rosie Zhao et al.

ICML 2024 · spotlight · arXiv:2306.08590

Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?

Khashayar Gatmiry, Nikunj Saunshi, Sashank J. Reddi et al.

ICML 2024 · poster · arXiv:2410.08292

Convergence and Trade-Offs in Riemannian Gradient Descent and Riemannian Proximal Point

David Martínez-Rubio, Christophe Roux, Sebastian Pokutta

ICML 2024 · poster · arXiv:2403.10429

Differentiability and Optimization of Multiparameter Persistent Homology

Luis Scoccola, Siddharth Setlur, David Loiseaux et al.

ICML 2024 · poster · arXiv:2406.07224

Interpreting and Improving Diffusion Models from an Optimization Perspective

Frank Permenter, Chenyang Yuan

ICML 2024 · poster · arXiv:2306.04848

Learning Associative Memories with Gradient Descent

Vivien Cabannes, Berfin Simsek, Alberto Bietti

ICML 2024 · poster

Non-stationary Online Convex Optimization with Arbitrary Delays

Yuanyu Wan, Chang Yao, Mingli Song et al.

ICML 2024 · poster

Position: Do pretrained Transformers Learn In-Context by Gradient Descent?

Lingfeng Shen, Aayush Mishra, Daniel Khashabi

ICML 2024 · poster

Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context

Xiang Cheng, Yuxin Chen, Suvrit Sra

ICML 2024 · poster · arXiv:2312.06528

Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot

Zixuan Wang, Stanley Wei, Daniel Hsu et al.

ICML 2024 · poster · arXiv:2406.06893