Oral "inference acceleration" Papers
6 papers found
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lyu, Chenyang Si, Junhao Song et al.
ICLR 2025 (oral), arXiv:2410.19355
58 citations
HoliTom: Holistic Token Merging for Fast Video Large Language Models
Kele Shao, Keda Tao, Can Qin et al.
NeurIPS 2025 (oral), arXiv:2505.21334
20 citations
SLMRec: Distilling Large Language Models into Small for Sequential Recommendation
Wujiang Xu, Qitian Wu, Zujie Liang et al.
ICLR 2025 (oral), arXiv:2405.17890
18 citations
VLA-Cache: Efficient Vision-Language-Action Manipulation via Adaptive Token Caching
Siyu Xu, Yunke Wang, Chenghao Xia et al.
NeurIPS 2025 (oral), arXiv:2502.02175
27 citations
How Deep Do We Need: Accelerating Training and Inference of Neural ODEs via Control Perspective
Keyan Miao, Konstantinos Gatsis
ICML 2024 (oral)
REST: Efficient and Accelerated EEG Seizure Analysis through Residual State Updates
Arshia Afzal, Grigorios Chrysos, Volkan Cevher et al.
ICML 2024 (oral), arXiv:2406.16906
12 citations