Poster Papers: "low-bit quantization"

13 papers found

CodeGEMM: A Codebook-Centric Approach to Efficient GEMM in Quantized LLMs

Gunho Park, Jeongin Bae, Byeongwook Kim et al.

NeurIPS 2025 · poster · arXiv:2512.17970

Pioneering 4-Bit FP Quantization for Diffusion Models: Mixup-Sign Quantization and Timestep-Aware Fine-Tuning

Maosen Zhao, Pengtao Chen, Chong Yu et al.

CVPR 2025 · poster · arXiv:2505.21591
3 citations

RSAVQ: Riemannian Sensitivity-Aware Vector Quantization for Large Language Models

Zukang Xu, Xing Hu, Qiang Wu et al.

NeurIPS 2025 · poster · arXiv:2510.01240

SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators

Rasoul Shafipour, David Harrison, Maxwell Horton et al.

ICLR 2025 · poster · arXiv:2410.10714
2 citations

ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba

Juncan Deng, Shuaiting Li, Zeyu Wang et al.

ICCV 2025 · poster · arXiv:2503.09509

Accurate LoRA-Finetuning Quantization of LLMs via Information Retention

Haotong Qin, Xudong Ma, Xingyu Zheng et al.

ICML 2024 · poster

Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs

Yeonhong Park, Jake Hyun, SangLyul Cho et al.

ICML 2024 · poster

BiE: Bi-Exponent Block Floating-Point for Large Language Models Quantization

Lancheng Zou, Wenqian Zhao, Shuo Yin et al.

ICML 2024 · poster

Extreme Compression of Large Language Models via Additive Quantization

Vage Egiazarian, Andrei Panferov, Denis Kuznedelev et al.

ICML 2024 · poster

FrameQuant: Flexible Low-Bit Quantization for Transformers

Harshavardhan Adepu, Zhanpeng Zeng, Li Zhang et al.

ICML 2024 · poster

GenQ: Quantization in Low Data Regimes with Generative Synthetic Data

Yuhang Li, Youngeun Kim, Donghyun Lee et al.

ECCV 2024 · poster · arXiv:2312.05272
6 citations

Sharpness-Aware Data Generation for Zero-shot Quantization

Hoang Dung, Cuong Pham, Trung Le et al.

ICML 2024 · poster

Towards Robust Full Low-bit Quantization of Super Resolution Networks

Denis Makhov, Irina Zhelavskaya, Ruslan Ostapets et al.

ECCV 2024 · poster
1 citation