NeurIPS 2025 "transformer efficiency" Papers
6 papers found
Attribution-Driven Adaptive Token Pruning for Transformers
Yaoyao Yan, Hui Yu, Weizhi Xu
NeurIPS 2025 · poster
Degrees of Freedom for Linear Attention: Distilling Softmax Attention with Optimal Feature Efficiency
Naoki Nishikawa, Rei Higuchi, Taiji Suzuki
NeurIPS 2025 · poster · arXiv:2507.03340
1 citation
Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction
Jeffrey Willette, Heejun Lee, Sung Ju Hwang
NeurIPS 2025 · poster · arXiv:2505.11254
2 citations
FlashBias: Fast Computation of Attention with Bias
Haixu Wu, Minghao Guo, Yuezhou Ma et al.
NeurIPS 2025 · poster · arXiv:2505.12044
1 citation
Fourier Token Merging: Understanding and Capitalizing Frequency Domain for Efficient Image Generation
Jiesong Liu, Xipeng Shen
NeurIPS 2025 · poster
ZeroS: Zero-Sum Linear Attention for Efficient Transformers
Jiecheng Lu, Xu Han, Yan Sun et al.
NeurIPS 2025 · spotlight