"stochastic gradient descent" Papers
32 papers found
Emergence and scaling laws in SGD learning of shallow neural networks
Yunwei Ren, Eshaan Nichani, Denny Wu et al.
Gaussian Approximation and Concentration of Constant Learning-Rate Stochastic Gradient Descent
Ziyang Wei, Jiaqi Li, Zhipeng Lou et al.
Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson–Romberg Extrapolation
Marina Sheshukova, Denis Belomestny, Alain Oliviero Durmus et al.
Online robust locally differentially private learning for nonparametric regression
Chenfei Gu, Qiangqiang Zhang, Ting Li et al.
Online Statistical Inference in Decision Making with Matrix Context
Qiyu Han, Will Wei Sun, Yichen Zhang
Optimal Rates in Continual Linear Regression via Increasing Regularization
Ran Levinstein, Amit Attia, Matan Schliserman et al.
A Doubly Recursive Stochastic Compositional Gradient Descent Method for Federated Multi-Level Compositional Optimization
Hongchang Gao
Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks
Amit Peleg, Matthias Hein
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan et al.
Delving into the Convergence of Generalized Smooth Minimax Optimization
Wenhan Xian, Ziyi Chen, Heng Huang
Demystifying SGD with Doubly Stochastic Gradients
Kyurae Kim, Joohwan Ko, Yian Ma et al.
Double Stochasticity Gazes Faster: Snap-Shot Decentralized Stochastic Gradient Tracking Methods
Hao Di, Haishan Ye, Xiangyu Chang et al.
Efficient Online Set-valued Classification with Bandit Feedback
Zhou Wang, Xingye Qiao
Generalization Analysis of Stochastic Weight Averaging with General Sampling
Wang Peng, Li Shen, Zerui Tao et al.
How Private are DP-SGD Implementations?
Lynn Chua, Badih Ghazi, Pritish Kamath et al.
Implicit Compressibility of Overparametrized Neural Networks Trained with Heavy-Tailed SGD
Yijun Wan, Melih Barsbey, Abdellatif Zaidi et al.
Improved Stability and Generalization Guarantees of the Decentralized SGD Algorithm
Batiste Le Bars, Aurélien Bellet, Marc Tommasi et al.
LPGD: A General Framework for Backpropagation through Embedded Optimization Layers
Anselm Paulus, Georg Martius, Vít Musil
MoMo: Momentum Models for Adaptive Learning Rates
Fabian Schaipp, Ruben Ohana, Michael Eickenberg et al.
On Convergence of Incremental Gradient for Non-convex Smooth Functions
Anastasiia Koloskova, Nikita Doikov, Sebastian Stich et al.
Online Learning and Information Exponents: The Importance of Batch size & Time/Complexity Tradeoffs
Luca Arnaboldi, Yatin Dandi, Florent Krzakala et al.
On the Generalization of Stochastic Gradient Descent with Momentum
Ali Ramezani-Kebrya, Kimon Antonakopoulos, Volkan Cevher et al.
Plug-and-Play image restoration with Stochastic deNOising REgularization
Marien Renaud, Jean Prost, Arthur Leclaire et al.
Random features models: a way to study the success of naive imputation
Alexis Ayme, Claire Boyer, Aymeric Dieuleveut et al.
Random Scaling and Momentum for Non-smooth Non-convex Optimization
Qinzi Zhang, Ashok Cutkosky
Sliding Down the Stairs: How Correlated Latent Variables Accelerate Learning with Neural Networks
Lorenzo Bardone, Sebastian Goldt
Sparse Variational Student-t Processes
Jian Xu, Delu Zeng
Stability and Generalization of Stochastic Compositional Gradient Descent Algorithms
Ming Yang, Xiyuan Wei, Tianbao Yang et al.
The Role of Learning Algorithms in Collective Action
Omri Ben-Dov, Jake Fawkes, Samira Samadi et al.
Tuning-Free Stochastic Optimization
Ahmed Khaled, Chi Jin
Understanding Forgetting in Continual Learning with Linear Regression
Meng Ding, Kaiyi Ji, Di Wang et al.
What is the Long-Run Distribution of Stochastic Gradient Descent? A Large Deviations Analysis
Waïss Azizian, Franck Iutzeler, Jérôme Malick et al.