NEURIPS 2025 "sequence modeling" Papers
15 papers found
Achilles' Heel of Mamba: Essential difficulties of the Mamba architecture demonstrated by synthetic data
Tianyi Chen, Pengxiao Lin, Zhiwei Wang et al.
NEURIPS 2025 · spotlight · arXiv:2509.17514
BlockScan: Detecting Anomalies in Blockchain Transactions
Jiahao Yu, Xian Wu, Hao Liu et al.
NEURIPS 2025 · poster · arXiv:2410.04039
3 citations
DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products
Julien Siems, Timur Carstensen, Arber Zela et al.
NEURIPS 2025 · poster · arXiv:2502.10297
23 citations
EDELINE: Enhancing Memory in Diffusion-based World Models via Linear-Time Sequence Modeling
Jia-Hua Lee, Bor-Jiun Lin, Wei-Fang Sun et al.
NEURIPS 2025 · spotlight · arXiv:2502.00466
2 citations
Enhancing the Maximum Effective Window for Long-Term Time Series Forecasting
Jiahui Zhang, Zhengyang Zhou, Wenjie Du et al.
NEURIPS 2025 · poster
Evolutionary Reasoning Does Not Arise in Standard Usage of Protein Language Models
Yasha Ektefaie, Andrew Shen, Lavik Jain et al.
NEURIPS 2025 · poster
Hankel Singular Value Regularization for Highly Compressible State Space Models
Paul Schwerdtner, Jules Berman, Benjamin Peherstorfer
NEURIPS 2025 · poster · arXiv:2510.22951
2 citations
Improving Bilinear RNN with Closed-loop Control
Jiaxi Hu, Yongqi Pan, Jusen Du et al.
NEURIPS 2025 · spotlight · arXiv:2506.02475
3 citations
Neural Attention Search
Difan Deng, Marius Lindauer
NEURIPS 2025 · poster · arXiv:2502.13251
Scaling Up Liquid-Resistance Liquid-Capacitance Networks for Efficient Sequence Modeling
Mónika Farsang, Radu Grosu
NEURIPS 2025 · poster · arXiv:2505.21717
4 citations
SeerAttention: Self-distilled Attention Gating for Efficient Long-context Prefilling
Yizhao Gao, Zhichen Zeng, DaYou Du et al.
NEURIPS 2025 · poster
Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models
Benjamin Walker, Lingyi Yang, Nicola Muca Cirone et al.
NEURIPS 2025 · spotlight · arXiv:2505.17761
6 citations
Tensor Product Attention Is All You Need
Yifan Zhang, Yifeng Liu, Huizhuo Yuan et al.
NEURIPS 2025 · spotlight · arXiv:2501.06425
33 citations
What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains
Chanakya Ekbote, Ashok Vardhan Makkuva, Marco Bondaschi et al.
NEURIPS 2025 · spotlight · arXiv:2508.07208
ZeroS: Zero-Sum Linear Attention for Efficient Transformers
Jiecheng Lu, Xu Han, Yan Sun et al.
NEURIPS 2025 · spotlight