ICML 2024 "model quantization" Papers
7 papers found
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention
Haotong Qin, Xudong Ma, Xingyu Zheng et al.
ICML 2024 poster
BiE: Bi-Exponent Block Floating-Point for Large Language Models Quantization
Lancheng Zou, Wenqian Zhao, Shuo Yin et al.
ICML 2024 poster
MGit: A Model Versioning and Management System
Wei Hao, Daniel Mendoza, Rafael Mendes et al.
ICML 2024 poster
Outlier-aware Slicing for Post-Training Quantization in Vision Transformer
Yuexiao Ma, Huixia Li, Xiawu Zheng et al.
ICML 2024 poster
Purifying Quantization-conditioned Backdoors via Layer-wise Activation Correction with Distribution Approximation
Boheng Li, Yishuo Cai, Jisong Cai et al.
ICML 2024 poster
SFC: Achieve Accurate Fast Convolution under Low-precision Arithmetic
Liulu He, Yufei Zhao, Rui Gao et al.
ICML 2024 poster
Test-Time Model Adaptation with Only Forward Passes
Shuaicheng Niu, Chunyan Miao, Guohao Chen et al.
ICML 2024 poster