"adaptive inference" Papers
5 papers found
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning
Yiwu Zhong, Zhuoming Liu, Yin Li et al.
ICCV 2025, poster, arXiv:2412.03248
21 citations
Beyond Greedy Exits: Improved Early Exit Decisions for Risk Control and Reliability
Divya Jyoti Bajpai, Manjesh Kumar Hanawal
NeurIPS 2025, poster, arXiv:2509.23666
Dynamic Diffusion Transformer
Wangbo Zhao, Yizeng Han, Jiasheng Tang et al.
ICLR 2025, poster, arXiv:2410.03456
34 citations
Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs
Hao Kang, Qingru Zhang, Han Cai et al.
NeurIPS 2025, spotlight, arXiv:2505.19481
4 citations
Flextron: Many-in-One Flexible Large Language Model
Ruisi Cai, Saurav Muralidharan, Greg Heinrich et al.
ICML 2024, poster