Lu Yin
9 Papers
38 Total Citations

Papers (9)
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
ICLR 2025
22 citations
From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications
ICML 2025
15 citations
AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs
NeurIPS 2025
1 citation
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
ICML 2024
0 citations
MagShield: Towards Better Robustness in Sparse Inertial Motion Capture Under Magnetic Disturbances
ICCV 2025
0 citations
Advancing Dynamic Sparse Training by Exploring Optimization Opportunities
ICML 2024
0 citations
Junk DNA Hypothesis: Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs
ICML 2024
0 citations
Sparse Training via Boosting Pruning Plasticity with Neuroregeneration
NeurIPS 2021
0 citations
Dynamic Sparsity Is Channel-Level Sparsity Learner
NeurIPS 2023
0 citations