"gradient clipping" Papers
6 papers found
Convergence of Distributed Adaptive Optimization with Local Updates
Ziheng Cheng, Margalit Glasgow
ICLR 2025 (poster) · arXiv:2409.13155
3 citations
Escaping saddle points without Lipschitz smoothness: the power of nonlinear preconditioning
Alexander Bodard, Panagiotis Patrinos
NeurIPS 2025 (spotlight) · arXiv:2509.15817
2 citations
Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping
Zijian Liu, Zhengyuan Zhou
ICLR 2025 (poster) · arXiv:2412.19529
23 citations
Delving into Differentially Private Transformer
Youlong Ding, Xueyang Wu, Yining Meng et al.
ICML 2024 (poster)
High-Probability Bound for Non-Smooth Non-Convex Stochastic Optimization with Heavy Tails
Langqi Liu, Yibo Wang, Lijun Zhang
ICML 2024 (poster)
High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise
Eduard Gorbunov, Abdurakhmon Sadiev, Marina Danilova et al.
ICML 2024 (poster)