"computational cost reduction" Papers

10 papers found

Attribution-Driven Adaptive Token Pruning for Transformers

YAOYAO YAN, Hui Yu, Weizhi Xu

NeurIPS 2025, poster

Diffusion on Demand: Selective Caching and Modulation for Efficient Generation

Hee Min Choi, Hyoa Kang, Dokwan Oh et al.

NeurIPS 2025, poster

MMTEB: Massive Multilingual Text Embedding Benchmark

Kenneth Enevoldsen, Isaac Chung, Imene Kerboua et al.

ICLR 2025, poster, arXiv:2502.13595, 74 citations

OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models

Huanpeng Chu, Wei Wu, Guanyu Feng et al.

ICCV 2025, poster, arXiv:2508.16212, 6 citations

Pruning All-Rounder: Rethinking and Improving Inference Efficiency for Large Vision Language Models

Wei Suo, Ji Ma, Mengyang Sun et al.

ICCV 2025, poster, arXiv:2412.06458, 1 citation

Reasoning Planning for Language Models

Ngoc Bao Nguyen, Trung Hieu Nguyen, Ruifeng She et al.

NeurIPS 2025, spotlight, arXiv:2511.00521

TREAD: Token Routing for Efficient Architecture-agnostic Diffusion Training

Felix Krause, Timy Phan, Ming Gui et al.

ICCV 2025, poster, arXiv:2501.04765, 10 citations

Accelerating PDE Data Generation via Differential Operator Action in Solution Space

huanshuo dong, Hong Wang, Haoyang Liu et al.

ICML 2024, poster, arXiv:2402.05957

Online Cascade Learning for Efficient Inference over Streams

Lunyiu Nie, Zhimin Ding, Erdong Hu et al.

ICML 2024, poster, arXiv:2402.04513

Refined Coreset Selection: Towards Minimal Coreset Size under Model Performance Constraints

Xiaobo Xia, Jiale Liu, Shaokun Zhang et al.

ICML 2024, spotlight, arXiv:2311.08675