2025 "quantization" Papers
2 papers found
QSVD: Efficient Low-rank Approximation for Unified Query-Key-Value Weight Compression in Low-Precision Vision-Language Models
Yutong Wang, Haiyu Wang, Sai Qian Zhang
NeurIPS 2025spotlightarXiv:2510.16292
1
citations
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov, Kushal Tirumala, Hassan Shapourian et al.
ICLR 2025posterarXiv:2403.17887
160
citations