Poster "computational cost reduction" Papers

11 papers found

Attribution-Driven Adaptive Token Pruning for Transformers

Yaoyao Yan, Hui Yu, Weizhi Xu

NEURIPS 2025 poster

Diffusion on Demand: Selective Caching and Modulation for Efficient Generation

Hee Min Choi, Hyoa Kang, Dokwan Oh et al.

NEURIPS 2025 poster

MMTEB: Massive Multilingual Text Embedding Benchmark

Kenneth Enevoldsen, Isaac Chung, Imene Kerboua et al.

ICLR 2025 poster arXiv:2502.13595
74 citations

OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models

Huanpeng Chu, Wei Wu, Guanyu Feng et al.

ICCV 2025 poster arXiv:2508.16212
6 citations

Pruning All-Rounder: Rethinking and Improving Inference Efficiency for Large Vision Language Models

Wei Suo, Ji Ma, Mengyang Sun et al.

ICCV 2025 poster arXiv:2412.06458
1 citation

ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers

Qianhao Yuan, Qingyu Zhang, Yanjiang Liu et al.

ICCV 2025 poster arXiv:2504.00502

SMRS: advocating a unified reporting standard for surrogate models in the artificial intelligence era.

Elizaveta Semenova, Siobhan Mackenzie Hall, Timothy James Hitge et al.

NEURIPS 2025 poster arXiv:2502.06753

TREAD: Token Routing for Efficient Architecture-agnostic Diffusion Training

Felix Krause, Timy Phan, Ming Gui et al.

ICCV 2025 poster arXiv:2501.04765
10 citations

Zero-Shot Vision Encoder Grafting via LLM Surrogates

Kaiyu Yue, Vasu Singla, Menglin Jia et al.

ICCV 2025 poster arXiv:2505.22664

Accelerating PDE Data Generation via Differential Operator Action in Solution Space

Huanshuo Dong, Hong Wang, Haoyang Liu et al.

ICML 2024 poster arXiv:2402.05957

Online Cascade Learning for Efficient Inference over Streams

Lunyiu Nie, Zhimin Ding, Erdong Hu et al.

ICML 2024 poster arXiv:2402.04513