"computational cost reduction" Papers
10 papers found
Attribution-Driven Adaptive Token Pruning for Transformers
Yaoyao Yan, Hui Yu, Weizhi Xu
NeurIPS 2025, poster
Diffusion on Demand: Selective Caching and Modulation for Efficient Generation
Hee Min Choi, Hyoa Kang, Dokwan Oh et al.
NeurIPS 2025, poster
MMTEB: Massive Multilingual Text Embedding Benchmark
Kenneth Enevoldsen, Isaac Chung, Imene Kerboua et al.
ICLR 2025, poster, arXiv:2502.13595
74 citations
OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models
Huanpeng Chu, Wei Wu, Guanyu Feng et al.
ICCV 2025, poster, arXiv:2508.16212
6 citations
Pruning All-Rounder: Rethinking and Improving Inference Efficiency for Large Vision Language Models
Wei Suo, Ji Ma, Mengyang Sun et al.
ICCV 2025, poster, arXiv:2412.06458
1 citation
Reasoning Planning for Language Models
Ngoc Bao Nguyen, Trung Hieu Nguyen, Ruifeng She et al.
NeurIPS 2025, spotlight, arXiv:2511.00521
TREAD: Token Routing for Efficient Architecture-agnostic Diffusion Training
Felix Krause, Timy Phan, Ming Gui et al.
ICCV 2025, poster, arXiv:2501.04765
10 citations
Accelerating PDE Data Generation via Differential Operator Action in Solution Space
Huanshuo Dong, Hong Wang, Haoyang Liu et al.
ICML 2024, poster, arXiv:2402.05957
Online Cascade Learning for Efficient Inference over Streams
Lunyiu Nie, Zhimin Ding, Erdong Hu et al.
ICML 2024, poster, arXiv:2402.04513
Refined Coreset Selection: Towards Minimal Coreset Size under Model Performance Constraints
Xiaobo Xia, Jiale Liu, Shaokun Zhang et al.
ICML 2024, spotlight, arXiv:2311.08675