Shiwei Liu
9
Papers
37
Total Citations
Papers (9)
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
ICLR 2025
22
citations
From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications
ICML 2025
15
citations
Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More
ICML 2025
0
citations
Advancing Dynamic Sparse Training by Exploring Optimization Opportunities
ICML 2024
0
citations
Sparse Cocktail: Every Sparse Pattern Every Sparse Ratio All At Once
ICML 2024
0
citations
Junk DNA Hypothesis: Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs
ICML 2024
0
citations
CaM: Cache Merging for Memory-efficient LLMs Inference
ICML 2024
0
citations
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
ICML 2024
0
citations
Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective
AAAI 2025
0
citations