2025 Poster "efficient inference" Papers
5 papers found
Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification
Wenxuan Huang, Zijie Zhai, Yunhang Shen et al.
ICLR 2025 | poster | arXiv:2412.00876
38 citations
Plug-and-Play Context Feature Reuse for Efficient Masked Generation
Xuejie Liu, Anji Liu, Guy Van den Broeck et al.
NeurIPS 2025 | poster | arXiv:2505.19089
3 citations
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation
Ao Wang, Hui Chen, Jianchao Tan et al.
NeurIPS 2025 | poster | arXiv:2412.03409
5 citations
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
Tianyu Fu, Yi Ge, Yichen You et al.
NeurIPS 2025 | poster | arXiv:2505.21600
11 citations
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models
Yoojin Jung, Byung Cheol Song
CVPR 2025 | poster | arXiv:2504.04747
1 citation