Poster "computational cost reduction" Papers
11 papers found
Attribution-Driven Adaptive Token Pruning for Transformers
Yaoyao Yan, Hui Yu, Weizhi Xu
NEURIPS 2025 poster
Diffusion on Demand: Selective Caching and Modulation for Efficient Generation
Hee Min Choi, Hyoa Kang, Dokwan Oh et al.
NEURIPS 2025 poster
MMTEB: Massive Multilingual Text Embedding Benchmark
Kenneth Enevoldsen, Isaac Chung, Imene Kerboua et al.
ICLR 2025 poster, arXiv:2502.13595, 74 citations
OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models
Huanpeng Chu, Wei Wu, Guanyu Feng et al.
ICCV 2025 poster, arXiv:2508.16212, 6 citations
Pruning All-Rounder: Rethinking and Improving Inference Efficiency for Large Vision Language Models
Wei Suo, Ji Ma, Mengyang Sun et al.
ICCV 2025 poster, arXiv:2412.06458, 1 citation
ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers
Qianhao Yuan, Qingyu Zhang, Yanjiang Liu et al.
ICCV 2025 poster, arXiv:2504.00502
SMRS: advocating a unified reporting standard for surrogate models in the artificial intelligence era.
Elizaveta Semenova, Siobhan Mackenzie Hall, Timothy James Hitge et al.
NEURIPS 2025 poster, arXiv:2502.06753
TREAD: Token Routing for Efficient Architecture-agnostic Diffusion Training
Felix Krause, Timy Phan, Ming Gui et al.
ICCV 2025 poster, arXiv:2501.04765, 10 citations
Zero-Shot Vision Encoder Grafting via LLM Surrogates
Kaiyu Yue, Vasu Singla, Menglin Jia et al.
ICCV 2025 poster, arXiv:2505.22664
Accelerating PDE Data Generation via Differential Operator Action in Solution Space
Huanshuo Dong, Hong Wang, Haoyang Liu et al.
ICML 2024 poster, arXiv:2402.05957
Online Cascade Learning for Efficient Inference over Streams
Lunyiu Nie, Zhimin Ding, Erdong Hu et al.
ICML 2024 poster, arXiv:2402.04513