"efficient inference" Papers
7 papers found
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models
Yoojin Jung, Byung Cheol Song
CVPR 2025, poster, arXiv:2504.04747
1 citation
ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference
Ziqian Zeng, Yihuai Hong, Hongliang Dai et al.
AAAI 2024, paper, arXiv:2312.11882
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
Junyuan Hong, Jinhao Duan, Chenhui Zhang et al.
ICML 2024, poster
Deep Feature Surgery: Towards Accurate and Efficient Multi-Exit Networks
Cheng Gong, Yao Chen, Qiuyang Luo et al.
ECCV 2024, poster, arXiv:2407.13986
3 citations
FrameQuant: Flexible Low-Bit Quantization for Transformers
Harshavardhan Adepu, Zhanpeng Zeng, Li Zhang et al.
ICML 2024, poster
MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices
Yang Zhao, Zhisheng Xiao, Yanwu Xu et al.
ECCV 2024, poster, arXiv:2311.16567
35 citations
Switchable Decision: Dynamic Neural Generation Networks
Shujian Zhang, Korawat Tanwisuth, Chengyue Gong et al.
ICML 2024, poster