2024 Poster "efficient inference" Papers
5 papers found
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
Junyuan Hong, Jinhao Duan, Chenhui Zhang et al.
ICML 2024, poster, arXiv:2403.15447
Deep Feature Surgery: Towards Accurate and Efficient Multi-Exit Networks
Cheng Gong, Yao Chen, Qiuyang Luo et al.
ECCV 2024, poster, arXiv:2407.13986
3 citations
FrameQuant: Flexible Low-Bit Quantization for Transformers
Harshavardhan Adepu, Zhanpeng Zeng, Li Zhang et al.
ICML 2024, poster, arXiv:2403.06082
MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices
Yang Zhao, Zhisheng Xiao, Yanwu Xu et al.
ECCV 2024, poster, arXiv:2311.16567
35 citations
Switchable Decision: Dynamic Neural Generation Networks
Shujian Zhang, Korawat Tanwisuth, Chengyue Gong et al.
ICML 2024, poster, arXiv:2405.04513