NEURIPS 2025 "weight normalization" Papers
2 papers found
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
Yonggan Fu, Xin Dong, Shizhe Diao et al.
NEURIPS 2025 | poster | arXiv:2511.18890
Scaling Off-Policy Reinforcement Learning with Batch and Weight Normalization
Daniel Palenicek, Florian Vogt, Joe Watson et al.
NEURIPS 2025 | poster | arXiv:2502.07523
8 citations