Stochastic Optimization
Stochastic gradient descent (SGD) and related stochastic optimization methods
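For orientation, here is a minimal sketch of the plain SGD-with-momentum (heavy-ball) update that many of the papers listed below analyze or build on. It is illustrative only; the names (sgd_momentum_step, lr, momentum) are our own and are not taken from any listed paper.

```python
import numpy as np

def sgd_momentum_step(params, grad, velocity, lr=0.01, momentum=0.9):
    """One heavy-ball SGD update: v <- momentum*v - lr*grad; params <- params + v."""
    velocity = momentum * velocity - lr * grad
    return params + velocity, velocity

# Toy usage: minimize f(x) = 0.5 * ||x||^2 using noisy gradients.
rng = np.random.default_rng(0)
x, v = np.ones(3), np.zeros(3)
for _ in range(500):
    noisy_grad = x + 0.1 * rng.normal(size=x.shape)  # stochastic gradient of f
    x, v = sgd_momentum_step(x, noisy_grad, v)
print(x)  # near the minimizer at the origin, up to gradient-noise fluctuations
```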
Top Papers
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu, Zhiyuan Li, David Hall et al.
Test-time Alignment of Diffusion Models without Reward Over-optimization
Sunwoo Kim, Minkyu Kim, Dongmin Park
How to Fine-Tune Vision Models with SGD
Ananya Kumar, Ruoqi Shen, Sebastien Bubeck et al.
Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity
Eduard Gorbunov, Nazarii Tupitsa, Sayantan Choudhury et al.
ASGO: Adaptive Structured Gradient Optimization
Kang An, Yuxing Liu, Rui Pan et al.
Self-Improvement for Neural Combinatorial Optimization: Sample Without Replacement, but Improvement
Dominik Grimm, Jonathan Pirnay
ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-Order Optimization
Shuoran Jiang, Qingcai Chen, Yang Xiang et al.
B2Opt: Learning to Optimize Black-box Optimization with Little Budget
Xiaobin Li, Kai Wu, Xiaoyu Zhang et al.
Standard Gaussian Process is All You Need for High-Dimensional Bayesian Optimization
Zhitong Xu, Haitao Wang, Jeff Phillips et al.
Does SGD really happen in tiny subspaces?
Minhak Song, Kwangjun Ahn, Chulhee Yun
AdaGrad under Anisotropic Smoothness
Yuxing Liu, Rui Pan, Tong Zhang
Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics
Sebastian Sanokowski, Wilhelm Berghammer, Haoyu Wang et al.
Trust Region Methods for Nonconvex Stochastic Optimization beyond Lipschitz Smoothness
Chenghan Xie, Chenxi Li, Chuwen Zhang et al.
Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems
Juno Kim, Kakei Yamamoto, Kazusato Oko et al.
Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods
Sara Klein, Simon Weissmann, Leif Döring
Emergence and scaling laws in SGD learning of shallow neural networks
Yunwei Ren, Eshaan Nichani, Denny Wu et al.
Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance
Dimitris Oikonomou, Nicolas Loizou
Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression
Adam Block, Dylan Foster, Akshay Krishnamurthy et al.
Relaxing the Additivity Constraints in Decentralized No-Regret High-Dimensional Bayesian Optimization
Anthony Bardou, Patrick Thiran, Thomas Begin
Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning
Jacob Springer, Vaishnavh Nagarajan, Aditi Raghunathan
The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander Atanasov, Alexandru Meterez, James Simon et al.
Momentum-SAM: Sharpness Aware Minimization without Computational Overhead
Marlon Becker, Frederick Altrock, Benjamin Risse
DGPO: Discovering Multiple Strategies with Diversity-Guided Policy Optimization
Wenze Chen, Shiyu Huang, Yuan Chiang et al.
Implicit Bias of Spectral Descent and Muon on Multiclass Separable Data
Chen Fan, Mark Schmidt, Christos Thrampoulidis
Understanding the robustness difference between stochastic gradient descent and adaptive gradient methods
Avery Ma, Yangchen Pan, Amir-massoud Farahmand
Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise
Rui Pan, Yuxing Liu, Xiaoyu Wang et al.
Ringmaster ASGD: The First Asynchronous SGD with Optimal Time Complexity
Artavazd Maranjyan, Alexander Tyurin, Peter Richtarik
Expensive Multi-Objective Bayesian Optimization Based on Diffusion Models
Bingdong Li, Zixiang Di, Yongfan Lu et al.
Stochastic Semi-Gradient Descent for Learning Mean Field Games with Population-Aware Function Approximation
Chenyu Zhang, Xu Chen, Xuan Di
Optimizing Posterior Samples for Bayesian Optimization via Rootfinding
Taiwo Adebiyi, Bach Do, Ruda Zhang
Efficient Distributed Optimization under Heavy-Tailed Noise
Su Hyeong Lee, Manzil Zaheer, Tian Li
Distributionally Robust Optimization with Bias and Variance Reduction
Ronak Mehta, Vincent Roulet, Krishna Pillutla et al.
General framework for online-to-nonconvex conversion: Schedule-free SGD is also effective for nonconvex optimization
Kwangjun Ahn, Gagik Magakyan, Ashok Cutkosky
PSMGD: Periodic Stochastic Multi-Gradient Descent for Fast Multi-Objective Optimization
Mingjing Xu, Peizhong Ju, Jia Liu et al.
Large-Scale Non-convex Stochastic Constrained Distributionally Robust Optimization
Qi Zhang, Yi Zhou, Ashley Prater-Bennette et al.
MAST: Model-Agnostic Sparsified Training
Yury Demidovich, Grigory Malinovsky, Egor Shulgin et al.
Convergence of Clipped SGD on Convex $(L_0,L_1)$-Smooth Functions
Ofir Gaash, Kfir Y. Levy, Yair Carmon
Fast Last-Iterate Convergence of SGD in the Smooth Interpolation Regime
Amit Attia, Matan Schliserman, Uri Sherman et al.
Nesterov acceleration in benignly non-convex landscapes
Kanan Gupta, Stephan Wojtowytsch
Direct Distributional Optimization for Provable Alignment of Diffusion Models
Ryotaro Kawata, Kazusato Oko, Atsushi Nitanda et al.
Gradient-Variation Online Adaptivity for Accelerated Optimization with Hölder Smoothness
Yuheng Zhao, Yu-Hu Yan, Kfir Y. Levy et al.
Gradient Multi-Normalization for Efficient LLM Training
Meyer Scetbon, Chao Ma, Wenbo Gong et al.
Complexity Lower Bounds of Adaptive Gradient Algorithms for Non-convex Stochastic Optimization under Relaxed Smoothness
Michael Crawshaw, Mingrui Liu
Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent
Jiahuan Wang, Hong Chen
Convergence of Distributed Adaptive Optimization with Local Updates
Ziheng Cheng, Margalit Glasgow
Incremental Quasi-Newton Methods with Faster Superlinear Convergence Rates
Zhuanghua Liu, Luo Luo, Bryan Kian Hsiang Low
Pareto-Optimality, Smoothness, and Stochasticity in Learning-Augmented One-Max-Search
Ziyad Benomar, Lorenzo Croissant, Vianney Perchet et al.
Aligned Multi-Objective Optimization
Yonathan Efroni, Ben Kretzu, Daniel Jiang et al.
Quantum Optimization via Gradient-Based Hamiltonian Descent
Jiaqi Leng, Bin Shi
Regularized Langevin Dynamics for Combinatorial Optimization
Shengyu Feng, Yiming Yang
Global Optimization with a Power-Transformed Objective and Gaussian Smoothing
Chen Xu
Gradient correlation is a key ingredient to accelerate SGD with momentum
Julien Hermant, Marien Renaud, Jean-François Aujol et al.
Hamiltonian Descent Algorithms for Optimization: Accelerated Rates via Randomized Integration Time
Qiang Fu, Andre Wibisono
Towards Faster Decentralized Stochastic Optimization with Communication Compression
Rustem Islamov, Yuan Gao, Sebastian Stich
Second-Order Convergence in Private Stochastic Non-Convex Optimization
Youming Tao, Zuyuan Zhang, Dongxiao Yu et al.
Optimal Rates in Continual Linear Regression via Increasing Regularization
Ran Levinstein, Amit Attia, Matan Schliserman et al.
Long-time asymptotics of noisy SVGD outside the population limit
Victor Priser, Pascal Bianchi, Adil Salim
Newton Meets Marchenko-Pastur: Massively Parallel Second-Order Optimization with Hessian Sketching and Debiasing
Elad Romanov, Fangzhao Zhang, Mert Pilanci
Zeroth-Order Methods for Nonconvex Stochastic Problems with Decision-Dependent Distributions
Yuya Hikima, Akiko Takeda
A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning
Minyoung Kim, Timothy Hospedales
Differentially Private Stochastic Optimization with Heavy-tailed Data: Towards Optimal Rates
Puning Zhao, Jiafei Wu, Zhe Liu et al.
Sample-and-Bound for Non-convex Optimization
Yaoguang Zhai, Zhizhen Qin, Sicun Gao
Coupling-based Convergence Diagnostic and Stepsize Scheme for Stochastic Gradient Descent
Xiang Li, Qiaomin Xie
Incremental Gradient Descent with Small Epoch Counts is Surprisingly Slow on Ill-Conditioned Problems
Yujun Kim, Jaeyoung Cha, Chulhee Yun
Learning Curves of Stochastic Gradient Descent in Kernel Regression
Haihan Zhang, Weicheng Lin, Yuanshi Liu et al.
Consensus Based Stochastic Optimal Control
Liyao Lyu, Jingrun Chen
A Near-Optimal Single-Loop Stochastic Algorithm for Convex Finite-Sum Coupled Compositional Optimization
Bokun Wang, Tianbao Yang
Controlling the Flow: Stability and Convergence for Stochastic Gradient Descent with Decaying Regularization
Sebastian Kassing, Simon Weissmann, Leif Döring
SGD with memory: fundamental properties and stochastic acceleration
Dmitry Yarotsky, Maksim Velikanov
Sequential Stochastic Combinatorial Optimization Using Hierarchical Reinforcement Learning
Xinsong Feng, Zihan Yu, Yanhai Xiong et al.
Optimization by Parallel Quasi-Quantum Annealing with Gradient-Based Sampling
Yuma Ichikawa, Yamato Arai
Adam Reduces a Unique Form of Sharpness: Theoretical Insights Near the Minimizer Manifold
Xinghan Li, Haodong Wen, Kaifeng Lyu
MGDA Converges under Generalized Smoothness, Provably
Qi Zhang, Peiyao Xiao, Shaofeng Zou et al.
Solving hidden monotone variational inequalities with surrogate losses
Ryan D'Orazio, Danilo Vucetic, Zichu Liu et al.
Exploiting Hidden Symmetry to Improve Objective Perturbation for DP Linear Learners with a Nonsmooth L1-Norm
Du Chen, Geoffrey A. Chua
Nonlinearly Preconditioned Gradient Methods: Momentum and Stochastic Analysis
Konstantinos Oikonomidis, Jan Quan, Panagiotis Patrinos
Preference Optimization on Pareto Sets: On a Theory of Multi-Objective Optimization
Abhishek Roy, Geelon So, Yian Ma
A Near-Optimal Algorithm for Decentralized Convex-Concave Finite-Sum Minimax Optimization
Hongxu Chen, Ke Wei, Haishan Ye et al.
Learning from A Single Markovian Trajectory: Optimality and Variance Reduction
Zhenyu Sun, Ermin Wei
Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson–Romberg Extrapolation
Marina Sheshukova, Denis Belomestny, Alain Oliviero Durmus et al.
Asymptotic theory of SGD with a general learning-rate
Or Goldreich, Ziyang Wei, Soham Bonnerjee et al.
A Unified Analysis of Stochastic Gradient Descent with Arbitrary Data Permutations and Beyond
Yipeng Li, Xinchen Lyu, Zhenyu Liu
Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations
Shaocong Ma, Heng Huang
A Regularized Newton Method for Nonconvex Optimization with Global and Local Complexity Guarantees
Yuhao Zhou, Jintao Xu, Bingrui Li et al.
Revisiting Large-Scale Non-convex Distributionally Robust Optimization
Qi Zhang, Yi Zhou, Simon Khan et al.
PROFIT: A Specialized Optimizer for Deep Fine Tuning
Anirudh Chakravarthy, Shuai Zheng, Xin Huang et al.
Revisiting Consensus Error: A Fine-grained Analysis of Local SGD under Second-order Data Heterogeneity
Kumar Kshitij Patel, Ali Zindari, Sebastian Stich et al.
Searching for Optimal Solutions with LLMs via Bayesian Optimization
Dhruv Agarwal, Manoj Ghuhan Arivazhagan, Rajarshi Das et al.
On the Almost Sure Convergence of the Stochastic Three Points Algorithm
Taha El Bakkali El Kadi, Omar Saadi
Decreasing Entropic Regularization Averaged Gradient for Semi-Discrete Optimal Transport
Ferdinand Genans, Antoine Godichon-Baggioni, François-Xavier Vialard et al.
A Gradient Guided Diffusion Framework for Chance Constrained Programming
Boyang Zhang, Zhiguo Wang, Ya-Feng Liu
Optimistic Online-to-Batch Conversions for Accelerated Convergence and Universality
Yu-Hu Yan, Peng Zhao, Zhi-Hua Zhou
The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise
Shuze Daniel Liu, Shuhang Chen, Shangtong Zhang
Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping
Zijian Liu, Zhengyuan Zhou
Adaptive Methods through the Lens of SDEs: Theoretical Insights on the Role of Noise
Enea Monzio Compagnoni, Tianlin Liu, Rustem Islamov et al.
Large Stepsizes Accelerate Gradient Descent for Regularized Logistic Regression
Jingfeng Wu, Pierre Marion, Peter Bartlett
Tight High-Probability Bounds for Nonconvex Heavy-Tailed Scenario under Weaker Assumptions
Weixin An, Yuanyuan Liu, Fanhua Shang et al.
Gaussian Approximation and Concentration of Constant Learning-Rate Stochastic Gradient Descent
Ziyang Wei, Jiaqi Li, Zhipeng Lou et al.
A Gradient Guidance Perspective on Stepwise Preference Optimization for Diffusion Models
Joshua Tian Jin Tee, Hee Suk Yoon, Abu Hanif Muhammad Syarubany et al.
Leveraging Variable Sparsity to Refine Pareto Stationarity in Multi-Objective Optimization
Zeou Hu, Yaoliang Yu