"quantization" Papers
4 papers found
QSVD: Efficient Low-rank Approximation for Unified Query-Key-Value Weight Compression in Low-Precision Vision-Language Models
Yutong Wang, Haiyu Wang, Sai Qian Zhang
NeurIPS 2025 (spotlight), arXiv:2510.16292
1 citation
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov, Kushal Tirumala, Hassan Shapourian et al.
ICLR 2025 (poster), arXiv:2403.17887
160 citations
Compressing Large Language Models by Joint Sparsification and Quantization
Jinyang Guo, Jianyu Wu, Zining Wang et al.
ICML 2024 (poster)
Fed-QSSL: A Framework for Personalized Federated Learning under Bitwidth and Data Heterogeneity
Yiyue Chen, Haris Vikalo, Chianing Wang
AAAI 2024 (paper), arXiv:2312.13380
13 citations