2025 Oral "inference acceleration" Papers
2 papers found
HoliTom: Holistic Token Merging for Fast Video Large Language Models
Kele Shao, Keda TAO, Can Qin et al.
NeurIPS 2025oralarXiv:2505.21334
18
citations
VLA-Cache: Efficient Vision-Language-Action Manipulation via Adaptive Token Caching
Siyu Xu, Yunke Wang, Chenghao Xia et al.
NeurIPS 2025oralarXiv:2502.02175
27
citations