NEURIPS 2025 "sparse attention" Papers
10 papers found
DuSA: Fast and Accurate Dual-Stage Sparse Attention Mechanism Accelerating Both Training and Inference
Chong Wu, Jiawang Cao, Renjie Xu et al.
NEURIPS 2025 poster
Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access
Xiang Hu, Jiaqi Leng, Jun Zhao et al.
NEURIPS 2025 poster arXiv:2504.16795
2 citations
Inference-Time Hyper-Scaling with KV Cache Compression
Adrian Łańcucki, Konrad Staniszewski, Piotr Nawrot et al.
NEURIPS 2025 poster arXiv:2506.05345
14 citations
Kinetics: Rethinking Test-Time Scaling Law
Ranajoy Sadhukhan, Zhuoming Chen, Haizhong Zheng et al.
NEURIPS 2025 poster arXiv:2506.05333
7 citations
MoBA: Mixture of Block Attention for Long-Context LLMs
Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.
NEURIPS 2025 spotlight arXiv:2502.13189
94 citations
Re-ttention: Ultra Sparse Visual Generation via Attention Statistical Reshape
Ruichen Chen, Keith Mills, Liyao Jiang et al.
NEURIPS 2025 oral arXiv:2505.22918
1 citation
SALS: Sparse Attention in Latent Space for KV Cache Compression
Junlin Mu, Hantao Huang, Jihang Zhang et al.
NEURIPS 2025 poster arXiv:2510.24273
Spark Transformer: Reactivating Sparsity in Transformer FFN and Attention
Chong You, Kan Wu, Zhipeng Jia et al.
NEURIPS 2025 poster
2 citations
Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation
Shuo Yang, Haocheng Xi, Yilong Zhao et al.
NEURIPS 2025 spotlight arXiv:2505.18875
31 citations
Transformers Learn Faster with Semantic Focus
Parikshit Ram, Kenneth Clarkson, Tim Klinger et al.
NEURIPS 2025 poster arXiv:2506.14095