2025 "computational cost reduction" Papers
8 papers found
Attribution-Driven Adaptive Token Pruning for Transformers
Yaoyao Yan, Hui Yu, Weizhi Xu
NEURIPS 2025, poster
Diffusion on Demand: Selective Caching and Modulation for Efficient Generation
Hee Min Choi, Hyoa Kang, Dokwan Oh et al.
NEURIPS 2025, poster
MMTEB: Massive Multilingual Text Embedding Benchmark
Kenneth Enevoldsen, Isaac Chung, Imene Kerboua et al.
ICLR 2025, poster, arXiv:2502.13595, 74 citations
OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models
Huanpeng Chu, Wei Wu, Guanyu Feng et al.
ICCV 2025, poster, arXiv:2508.16212, 6 citations
Pruning All-Rounder: Rethinking and Improving Inference Efficiency for Large Vision Language Models
Wei Suo, Ji Ma, Mengyang Sun et al.
ICCV 2025, poster, arXiv:2412.06458, 1 citation
Reasoning Planning for Language Models
Ngoc Bao Nguyen, Trung Hieu Nguyen, Ruifeng She et al.
NEURIPS 2025, spotlight, arXiv:2511.00521
SMRS: advocating a unified reporting standard for surrogate models in the artificial intelligence era.
Elizaveta Semenova, Siobhan Mackenzie Hall, Timothy James Hitge et al.
NEURIPS 2025, poster, arXiv:2502.06753
TREAD: Token Routing for Efficient Architecture-agnostic Diffusion Training
Felix Krause, Timy Phan, Ming Gui et al.
ICCV 2025, poster, arXiv:2501.04765, 10 citations