NEURIPS Spotlight "sparse attention" Papers
3 papers found
MoBA: Mixture of Block Attention for Long-Context LLMs
Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.
NEURIPS 2025spotlightarXiv:2502.13189
94
citations
Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation
Shuo Yang, Haocheng Xi, Yilong Zhao et al.
NEURIPS 2025spotlightarXiv:2505.18875
31
citations
Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning
Chaofan Lin, Jiaming Tang, Shuo Yang et al.
NEURIPS 2025spotlightarXiv:2502.02770
12
citations