Poster "vanishing gradients" Papers
5 papers found
Revisiting Glorot Initialization for Long-Range Linear Recurrences
Noga Bar, Mariia Seleznova, Yotam Alexander et al.
NeurIPS 2025, poster, arXiv:2505.19827
Revisiting Residual Connections: Orthogonal Updates for Stable and Efficient Deep Networks
Giyeong Oh, Woohyun Cho, Siyeol Kim et al.
NeurIPS 2025, poster, arXiv:2505.11881
Polygonal Unadjusted Langevin Algorithms: Creating stable and efficient adaptive algorithms for neural networks
Dongyoung Lim, Sotirios Sabanis
ICML 2024, poster, arXiv:2105.13937
Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models
Akhil Kedia, Mohd Abbas Zaidi, Sushil Khyalia et al.
ICML 2024, poster, arXiv:2403.09635
Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues
Antonio Orvieto, Soham De, Caglar Gulcehre et al.
ICML 2024, poster, arXiv:2307.11888