NEURIPS "long-context modeling" Papers
5 papers found
Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access
Xiang Hu, Jiaqi Leng, Jun Zhao et al.
NEURIPS 2025 (poster) · arXiv:2504.16795 · 2 citations
Mamba Goes HoME: Hierarchical Soft Mixture-of-Experts for 3D Medical Image Segmentation
Szymon Płotka, Gizem Mert, Maciej Chrabaszcz et al.
NEURIPS 2025 (poster) · arXiv:2507.06363 · 1 citation
Rope to Nope and Back Again: A New Hybrid Attention Strategy
Bowen Yang, Bharat Venkitesh, Dwaraknath Gnaneshwar Talupuru et al.
NEURIPS 2025 (poster) · arXiv:2501.18795 · 20 citations
SkyLadder: Better and Faster Pretraining via Context Window Scheduling
Tongyao Zhu, Qian Liu, Haonan Wang et al.
NEURIPS 2025 (poster) · arXiv:2503.15450 · 2 citations
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
Maximilian Beck, Korbinian Pöppel, Phillip Lippe et al.
NEURIPS 2025 (poster) · arXiv:2503.14376 · 8 citations