2025 "training stability" Papers
5 papers found
A Gradient Guidance Perspective on Stepwise Preference Optimization for Diffusion Models
Joshua Tian Jin Tee, Hee Suk Yoon, Abu Hanif Muhammad Syarubany et al.
NeurIPS 2025, oral
Improving Neural Optimal Transport via Displacement Interpolation
Jaemoo Choi, Yongxin Chen, Jaewoong Choi
ICLR 2025, poster, arXiv:2410.03783
3 citations
PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs
Oskar van der Wal, Pietro Lesci, Max Müller-Eberstein et al.
ICLR 2025, poster, arXiv:2503.09543
14 citations
Quantum Doubly Stochastic Transformers
Jannis Born, Filip Skogh, Kahn Rhrissorrakrai et al.
NeurIPS 2025, spotlight, arXiv:2504.16275
2 citations
Revisiting Residual Connections: Orthogonal Updates for Stable and Efficient Deep Networks
Giyeong Oh, Woohyun Cho, Siyeol Kim et al.
NeurIPS 2025, poster, arXiv:2505.11881