Stochastic Optimization
SGD and related optimization methods
Top Papers
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu, Zhiyuan Li, David Hall et al.
Decomposed Diffusion Sampler for Accelerating Large-Scale Inverse Problems
Hyungjin Chung, Suhyeon Lee, Jong Chul Ye
Offline Actor-Critic for Average Reward MDPs
William Powell, Jeongyeol Kwon, Qiaomin Xie et al.
End-to-End Rate-Distortion Optimized 3D Gaussian Representation
Henan Wang, Hanxin Zhu, Tianyu He et al.
The Blessing of Randomness: SDE Beats ODE in General Diffusion-based Image Editing
Shen Nie, Hanzhong Guo, Cheng Lu et al.
FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally
Qiuhong Shen, Xingyi Yang, Xinchao Wang
Test-time Alignment of Diffusion Models without Reward Over-optimization
Sunwoo Kim, Minkyu Kim, Dongmin Park
Stable Neural Stochastic Differential Equations in Analyzing Irregular Time Series Data
YongKyung Oh, Dongyoung Lim, Sungil Kim
How to Fine-Tune Vision Models with SGD
Ananya Kumar, Ruoqi Shen, Sebastien Bubeck et al.
Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity
Eduard Gorbunov, Nazarii Tupitsa, Sayantan Choudhury et al.
Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
Junkang Wu, Yuexiang Xie, Zhengyi Yang et al.
Self-Improvement for Neural Combinatorial Optimization: Sample Without Replacement, but Improvement
Dominik Grimm, Jonathan Pirnay
ASGO: Adaptive Structured Gradient Optimization
Kang An, Yuxing Liu, Rui Pan et al.
Quasi-Monte Carlo for 3D Sliced Wasserstein
Khai Nguyen, Nicola Bariletto, Nhat Ho
Runtime Analysis of the SMS-EMOA for Many-Objective Optimization
Weijie Zheng, Benjamin Doerr
Self-Consistency Preference Optimization
Archiki Prasad, Weizhe Yuan, Richard Yuanzhe Pang et al.
The AdEMAMix Optimizer: Better, Faster, Older
Matteo Pagliardini, Pierre Ablin, David Grangier
Implicit bias of SGD in $L_2$-regularized linear DNNs: One-way jumps from high to low rank
Zihan Wang, Arthur Jacot
Domain Randomization via Entropy Maximization
Gabriele Tiboni, Pascal Klink, Jan Peters et al.
ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-Order Optimization
Shuoran Jiang, Qingcai Chen, Yang Xiang et al.
Constrained Bayesian Optimization under Partial Observations: Balanced Improvements and Provable Convergence
Shengbo Wang, Ke Li
Adversarial Adaptive Sampling: Unify PINN and Optimal Transport for the Approximation of PDEs
Kejun Tang, Jiayu Zhai, Xiaoliang Wan et al.
Temporally and Distributionally Robust Optimization for Cold-Start Recommendation
Xinyu Lin, Wenjie Wang, Jujia Zhao et al.
B2Opt: Learning to Optimize Black-box Optimization with Little Budget
Xiaobin Li, Kai Wu, Xiaoyu Zhang et al.
Understanding Optimization in Deep Learning with Central Flows
Jeremy Cohen, Alex Damian, Ameet Talwalkar et al.
Standard Gaussian Process is All You Need for High-Dimensional Bayesian Optimization
Zhitong Xu, Haitao Wang, Jeff Phillips et al.
No Preference Left Behind: Group Distributional Preference Optimization
Binwei Yao, Zefan Cai, Yun-Shiuan Chuang et al.
Provable Benefit of Annealed Langevin Monte Carlo for Non-log-concave Sampling
Wei Guo, Molei Tao, Yongxin Chen
Grokking at the Edge of Numerical Stability
Lucas Prieto, Melih Barsbey, Pedro Mediano et al.
Does SGD really happen in tiny subspaces?
Minhak Song, Kwangjun Ahn, Chulhee Yun
Learning to Optimize Permutation Flow Shop Scheduling via Graph-Based Imitation Learning
Longkang Li, Siyuan Liang, Zihao Zhu et al.
Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold
Jun Chen, Haishan Ye, Mengmeng Wang et al.
Adaptive teachers for amortized samplers
Minsu Kim, Sanghyeok Choi, Taeyoung Yun et al.
FunBO: Discovering Acquisition Functions for Bayesian Optimization with FunSearch
Virginia Aglietti, Ira Ktena, Jessica Schrouff et al.
Deep Distributed Optimization for Large-Scale Quadratic Programming
Augustinos Saravanos, Hunter Kuperman, Alex Oshin et al.
Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics
Sebastian Sanokowski, Wilhelm Berghammer, Haoyu Wang et al.
Improved Regret Bounds for Linear Adversarial MDPs via Linear Optimization
XiangCheng Zhang, Fang Kong, Baoxiang Wang et al.
AdaGrad under Anisotropic Smoothness
Yuxing Liu, Rui Pan, Tong Zhang
Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems
Juno Kim, Kakei Yamamoto, Kazusato Oko et al.
Light Schrödinger Bridge
Alexander Korotin, Nikita Gushchin, Evgeny Burnaev
Trust Region Methods for Nonconvex Stochastic Optimization beyond Lipschitz Smoothness
Chenghan Xie, Chenxi Li, Chuwen Zhang et al.
Emergence and scaling laws in SGD learning of shallow neural networks
Yunwei Ren, Eshaan Nichani, Denny Wu et al.
Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods
Sara Klein, Simon Weissmann, Leif Döring
Mitigating the Curse of Dimensionality for Certified Robustness via Dual Randomized Smoothing
Song Xia, Yi Yu, Jiang Xudong et al.
In Search of Adam’s Secret Sauce
Antonio Orvieto, Robert Gower
SDGMNet: Statistic-Based Dynamic Gradient Modulation for Local Descriptor Learning
Yuxin Deng, Jiayi Ma
Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression
Adam Block, Dylan Foster, Akshay Krishnamurthy et al.
Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models
Zeman Li, Xinwei Zhang, Peilin Zhong et al.
Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance
Dimitris Oikonomou, Nicolas Loizou
Momentum-SAM: Sharpness Aware Minimization without Computational Overhead
Marlon Becker, Frederick Altrock, Benjamin Risse
Noise Stability Optimization for Finding Flat Minima: A Hessian-based Regularization Approach
Haotian Ju, Hongyang Zhang, Dongyue Li
On the Crucial Role of Initialization for Matrix Factorization
Bingcong Li, Liang Zhang, Aryan Mokhtari et al.
Training-Free Guidance Beyond Differentiability: Scalable Path Steering with Tree Search in Diffusion and Flow Models
Yingqing Guo, Yukang Yang, Hui Yuan et al.
Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning
Jacob Springer, Vaishnavh Nagarajan, Aditi Raghunathan
Relaxing the Additivity Constraints in Decentralized No-Regret High-Dimensional Bayesian Optimization
Anthony Bardou, Patrick Thiran, Thomas Begin
Cumulative Regret Analysis of the Piyavskii–Shubert Algorithm and Its Variants for Global Optimization
Kaan Gökcesu, Hakan Gökcesu
Variational Inference for SDEs Driven by Fractional Noise
Rembert Daems, Manfred Opper, Guillaume Crevecoeur et al.
The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander Atanasov, Alexandru Meterez, James Simon et al.
Neural structure learning with stochastic differential equations
Benjie Wang, Joel Jennings, Wenbo Gong
Transition Path Sampling with Improved Off-Policy Training of Diffusion Path Samplers
Kiyoung Seong, Seonghyun Park, Seonghwan Kim et al.
InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment
Yunhong Lu, Qichao Wang, Hengyuan Cao et al.
Improved Active Learning via Dependent Leverage Score Sampling
Atsushi Shimizu, Xiaoou Cheng, Christopher Musco et al.
Efficient Alternating Minimization with Applications to Weighted Low Rank Approximation
Zhao Song, Mingquan Ye, Junze Yin et al.
Provable Convergence and Limitations of Geometric Tempering for Langevin Dynamics
Omar Chehab, Anna Korba, Austin Stromme et al.
On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent
Bingrui Li, Wei Huang, Andi Han et al.
Deep Nonlinear Sufficient Dimension Reduction
Yinfeng Chen, Yuling Jiao, Rui Qiu et al.
DGPO: Discovering Multiple Strategies with Diversity-Guided Policy Optimization
Wenze Chen, Shiyu Huang, Yuan Chiang et al.
Pareto Front-Diverse Batch Multi-Objective Bayesian Optimization
Alaleh Ahmadianshalchi, Syrine Belakaria, Janardhan Rao Doppa
Few for Many: Tchebycheff Set Scalarization for Many-Objective Optimization
Xi Lin, Yilu Liu, Xiaoyuan Zhang et al.
Implicit Bias of Spectral Descent and Muon on Multiclass Separable Data
Chen Fan, Mark Schmidt, Christos Thrampoulidis
Sample complexity of data-driven tuning of model hyperparameters in neural networks with structured parameter-dependent dual function
Maria-Florina Balcan, Anh Nguyen, Dravyansh Sharma
DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization
Gang Li, Ming Lin, Tomer Galanti et al.
Learning Semantic Latent Directions for Accurate and Controllable Human Motion Prediction
Guowei Xu, Jiale Tao, Wen Li et al.
Improved Metric Distortion via Threshold Approvals
Elliot Anshelevich, Aris Filos-Ratsikas, Christopher Jerrett et al.
Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments
Yun Qu, Cheems Wang, Yixiu Mao et al.
DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation
Wenliang Zhao, Haolin Wang, Jie Zhou et al.
Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift
Seongho Son, William Bankes, Sayak Ray Chowdhury et al.
Offline-to-Online Hyperparameter Transfer for Stochastic Bandits
Dravyansh Sharma, Arun Suggala
Understanding the robustness difference between stochastic gradient descent and adaptive gradient methods
Avery Ma, Yangchen Pan, Amir-massoud Farahmand
Sharpness-Aware Minimization: General Analysis and Improved Rates
Dimitris Oikonomou, Nicolas Loizou
On the Limitations of Temperature Scaling for Distributions with Overlaps
Muthu Chidambaram, Rong Ge
Colored Noise in PPO: Improved Exploration and Performance through Correlated Action Sampling
Jakob Hollenstein, Georg Martius, Justus Piater
Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games
Yang Cai, Gabriele Farina, Julien Grand-Clément et al.
Robust and Conjugate Spatio-Temporal Gaussian Processes
William Laplante, Matias Altamirano, Andrew Duncan et al.
Second Order Bounds for Contextual Bandits with Function Approximation
Aldo Pacchiano
POp-GS: Next Best View in 3D-Gaussian Splatting with P-Optimality
Joey Wilson, Marcelino M. de Almeida, Sachit Mahajan et al.
Universal generalization guarantees for Wasserstein distributionally robust models
Tam Le, Jerome Malick
Decision Tree Induction Through LLMs via Semantically-Aware Evolution
Tennison Liu, Nicolas Huynh, Mihaela van der Schaar
Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum
Sarit Khirirat, Abdurakhmon Sadiev, Artem Riabinin et al.
Ringmaster ASGD: The First Asynchronous SGD with Optimal Time Complexity
Artavazd Maranjyan, Alexander Tyurin, Peter Richtarik
Finite-Sample Analysis of Policy Evaluation for Robust Average Reward Reinforcement Learning
Yang Xu, Washim Mondal, Vaneet Aggarwal
Two-timescale Extragradient for Finding Local Minimax Points
Jiseok Chae, Kyuwon Kim, Donghwan Kim
Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise
Rui Pan, Yuxing Liu, Xiaoyu Wang et al.
Stochastic Online Instrumental Variable Regression: Regrets for Endogeneity and Bandit Feedback
Riccardo Della Vecchia, Debabrota Basu
Error Bounds for Gaussian Process Regression Under Bounded Support Noise with Applications to Safety Certification
Robert Reed, Luca Laurenti, Morteza Lahijanian
Expensive Multi-Objective Bayesian Optimization Based on Diffusion Models
Bingdong Li, Zixiang Di, Yongfan Lu et al.
Regret Analysis of Repeated Delegated Choice
Suho Shin, Keivan Rezaei, Mohammad Hajiaghayi et al.
Stochastic Semi-Gradient Descent for Learning Mean Field Games with Population-Aware Function Approximation
Chenyu Zhang, Xu Chen, Xuan Di
Privacy amplification by random allocation
Moshe Shenfeld, Vitaly Feldman
Towards Robustness and Explainability of Automatic Algorithm Selection
Xingyu Wu, Jibin Wu, Yu Zhou et al.