Poster "low-bit quantization" Papers
13 papers found
CodeGEMM: A Codebook-Centric Approach to Efficient GEMM in Quantized LLMs
Gunho Park, Jeongin Bae, Byeongwook Kim et al.
NeurIPS 2025 (poster), arXiv:2512.17970
Pioneering 4-Bit FP Quantization for Diffusion Models: Mixup-Sign Quantization and Timestep-Aware Fine-Tuning
Maosen Zhao, Pengtao Chen, Chong Yu et al.
CVPR 2025 (poster), arXiv:2505.21591, 3 citations
RSAVQ: Riemannian Sensitivity-Aware Vector Quantization for Large Language Models
Zukang Xu, Xing Hu, Qiang Wu et al.
NeurIPS 2025 (poster), arXiv:2510.01240
SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators
Rasoul Shafipour, David Harrison, Maxwell Horton et al.
ICLR 2025 (poster), arXiv:2410.10714, 2 citations
ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba
Juncan Deng, Shuaiting Li, Zeyu Wang et al.
ICCV 2025 (poster), arXiv:2503.09509
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention
Haotong Qin, Xudong Ma, Xingyu Zheng et al.
ICML 2024 (poster)
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
Yeonhong Park, Jake Hyun, SangLyul Cho et al.
ICML 2024 (poster)
BiE: Bi-Exponent Block Floating-Point for Large Language Models Quantization
Lancheng Zou, Wenqian Zhao, Shuo Yin et al.
ICML 2024 (poster)
Extreme Compression of Large Language Models via Additive Quantization
Vage Egiazarian, Andrei Panferov, Denis Kuznedelev et al.
ICML 2024 (poster)
FrameQuant: Flexible Low-Bit Quantization for Transformers
Harshavardhan Adepu, Zhanpeng Zeng, Li Zhang et al.
ICML 2024 (poster)
GenQ: Quantization in Low Data Regimes with Generative Synthetic Data
Yuhang Li, Youngeun Kim, Donghyun Lee et al.
ECCV 2024 (poster), arXiv:2312.05272, 6 citations
Sharpness-Aware Data Generation for Zero-shot Quantization
Hoang Dung, Cuong Pham, Trung Le et al.
ICML 2024 (poster)
Towards Robust Full Low-bit Quantization of Super Resolution Networks
Denis Makhov, Irina Zhelavskaya, Ruslan Ostapets et al.
ECCV 2024 (poster), 1 citation