ICML 2024 "post-training quantization" Papers
9 papers found
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
Yeonhong Park, Jake Hyun, SangLyul Cho et al.
ICML 2024 poster
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Wei Huang, Yangdong Liu, Haotong Qin et al.
ICML 2024 poster
ERQ: Error Reduction for Post-Training Quantization of Vision Transformers
Yunshan Zhong, Jiawei Hu, You Huang et al.
ICML 2024 spotlight
Evaluating Quantized Large Language Models
Shiyao Li, Xuefei Ning, Luning Wang et al.
ICML 2024 poster
FrameQuant: Flexible Low-Bit Quantization for Transformers
Harshavardhan Adepu, Zhanpeng Zeng, Li Zhang et al.
ICML 2024 poster
LQER: Low-Rank Quantization Error Reconstruction for LLMs
Cheng Zhang, Jianyi Cheng, George Constantinides et al.
ICML 2024 poster
Outlier-aware Slicing for Post-Training Quantization in Vision Transformer
Yuexiao Ma, Huixia Li, Xiawu Zheng et al.
ICML 2024 poster
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks
Albert Tseng, Jerry Chee, Qingyao Sun et al.
ICML 2024 poster
SqueezeLLM: Dense-and-Sparse Quantization
Sehoon Kim, Coleman Hooper, Amir Gholaminejad et al.
ICML 2024 poster