NEURIPS "sparse attention" Papers

13 papers found

DuSA: Fast and Accurate Dual-Stage Sparse Attention Mechanism Accelerating Both Training and Inference

Chong Wu, Jiawang Cao, Renjie Xu et al.

NEURIPS 2025, poster

Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access

Xiang Hu, Jiaqi Leng, Jun Zhao et al.

NEURIPS 2025, poster, arXiv:2504.16795
2 citations

Inference-Time Hyper-Scaling with KV Cache Compression

Adrian Łańcucki, Konrad Staniszewski, Piotr Nawrot et al.

NEURIPS 2025, poster, arXiv:2506.05345
14 citations

Kinetics: Rethinking Test-Time Scaling Law

Ranajoy Sadhukhan, Zhuoming Chen, Haizhong Zheng et al.

NEURIPS 2025, poster, arXiv:2506.05333
7 citations

MoBA: Mixture of Block Attention for Long-Context LLMs

Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.

NEURIPS 2025, spotlight, arXiv:2502.13189
94 citations

Overcoming Long Context Limitations of State Space Models via Context Dependent Sparse Attention

Zhihao Zhan, Jianan Zhao, Zhaocheng Zhu et al.

NEURIPS 2025, poster

Re-ttention: Ultra Sparse Visual Generation via Attention Statistical Reshape

Ruichen Chen, Keith Mills, Liyao Jiang et al.

NEURIPS 2025, oral, arXiv:2505.22918
1 citation

SALS: Sparse Attention in Latent Space for KV Cache Compression

Junlin Mu, Hantao Huang, Jihang Zhang et al.

NEURIPS 2025, poster, arXiv:2510.24273

Spark Transformer: Reactivating Sparsity in Transformer FFN and Attention

Chong You, Kan Wu, Zhipeng Jia et al.

NEURIPS 2025, poster
2 citations

Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation

Shuo Yang, Haocheng Xi, Yilong Zhao et al.

NEURIPS 2025, spotlight, arXiv:2505.18875
31 citations

The emergence of sparse attention: impact of data distribution and benefits of repetition

Nicolas Zucchet, Francesco D'Angelo, Andrew Lampinen et al.

NEURIPS 2025, oral, arXiv:2505.17863
6 citations

Transformers Learn Faster with Semantic Focus

Parikshit Ram, Kenneth Clarkson, Tim Klinger et al.

NEURIPS 2025, poster, arXiv:2506.14095

Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning

Chaofan Lin, Jiaming Tang, Shuo Yang et al.

NEURIPS 2025, spotlight, arXiv:2502.02770
12 citations