Papers by Jared Tanner
4 papers found
Mind the Gap: a Spectral Analysis of Rank Collapse and Signal Propagation in Attention Layers
Thiziri Nait Saada, Alireza Naderi, Jared Tanner
ICML 2025, poster, arXiv:2410.07799
Beyond IID weights: sparse and low-rank deep Neural Networks are also Gaussian Processes
Thiziri Nait Saada, Alireza Naderi, Jared Tanner
ICLR 2024, poster, arXiv:2310.16597
Deep Neural Network Initialization with Sparsity Inducing Activations
Ilan Price, Nicholas Daultry Ball, Adam Jones et al.
ICLR 2024, poster, arXiv:2402.16184
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
Yuxin Zhang, Lirui Zhao, Mingbao Lin et al.
ICLR 2024, poster, arXiv:2310.08915