NeurIPS Poster "efficient inference" Papers
3 papers found
Plug-and-Play Context Feature Reuse for Efficient Masked Generation
Xuejie Liu, Anji Liu, Guy Van den Broeck et al.
NeurIPS 2025 · poster · arXiv:2505.19089
3 citations
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation
Ao Wang, Hui Chen, Jianchao Tan et al.
NeurIPS 2025 · poster · arXiv:2412.03409
5 citations
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
Tianyu Fu, Yi Ge, Yichen You et al.
NeurIPS 2025 · poster · arXiv:2505.21600
11 citations