"weight quantization" Papers
7 papers found
Training-Free Activation Sparsity in Large Language Models
James Liu, Pragaash Ponnusamy, Tianle Cai et al.
ICLR 2025 (poster), arXiv:2408.14690
37 citations
A2Q+: Improving Accumulator-Aware Weight Quantization
Ian Colbert, Alessandro Pappalardo, Jakoba Petri-Koenig et al.
ICML 2024 (poster)
ERQ: Error Reduction for Post-Training Quantization of Vision Transformers
Yunshan Zhong, Jiawei Hu, You Huang et al.
ICML 2024 (spotlight)
ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking
Wenshuo Li, Xinghao Chen, Han Shu et al.
ICML 2024 (poster)
Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
Zhewei Yao, Xiaoxia Wu, Cheng Li et al.
AAAI 2024 (paper), arXiv:2303.08302
70 citations
Extreme Compression of Large Language Models via Additive Quantization
Vage Egiazarian, Andrei Panferov, Denis Kuznedelev et al.
ICML 2024 (poster)
OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models
Changhun Lee, Jungyu Jin, Taesu Kim et al.
AAAI 2024 (paper), arXiv:2306.02272
100 citations