NEURIPS "activation function analysis" Papers
2 papers found
Emergence and scaling laws in SGD learning of shallow neural networks
Yunwei Ren, Eshaan Nichani, Denny Wu et al.
NEURIPS 2025, poster, arXiv:2504.19983
15 citations
SAD Neural Networks: Divergent Gradient Flows and Asymptotic Optimality via o-minimal Structures
Julian Kranz, Davide Gallon, Steffen Dereich et al.
NEURIPS 2025, poster, arXiv:2505.09572
3 citations