2025 "layer normalization" Papers
2 papers found
Impact of Layer Norm on Memorization and Generalization in Transformers
Rishi Singhal, Jung-Eun Kim
NeurIPS 2025, poster, arXiv:2511.10566
1 citation
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
Weronika Ormaniec, Felix Dangel, Sidak Pal Singh
ICLR 2025, poster, arXiv:2410.10986
10 citations