"sequence modeling" Papers

32 papers found

BlockScan: Detecting Anomalies in Blockchain Transactions

Jiahao Yu, Xian Wu, Hao Liu et al.

NeurIPS 2025 | poster | arXiv:2410.04039 | 3 citations

Competition Dynamics Shape Algorithmic Phases of In-Context Learning

Core Francisco Park, Ekdeep Singh Lubana, Hidenori Tanaka

ICLR 2025 | poster | arXiv:2412.01003 | 34 citations

Controllable Generation via Locally Constrained Resampling

Kareem Ahmed, Kai-Wei Chang, Guy Van den Broeck

ICLR 2025 | poster | arXiv:2410.13111 | 9 citations

Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient

Wenlong Wang, Ivana Dusparic, Yucheng Shi et al.

ICLR 2025 | poster | arXiv:2410.08893 | 3 citations

EDELINE: Enhancing Memory in Diffusion-based World Models via Linear-Time Sequence Modeling

Jia-Hua Lee, Bor-Jiun Lin, Wei-Fang Sun et al.

NeurIPS 2025 | spotlight | arXiv:2502.00466 | 2 citations

Enhancing the Maximum Effective Window for Long-Term Time Series Forecasting

Jiahui Zhang, Zhengyang Zhou, Wenjie Du et al.

NeurIPS 2025 | poster

Evolutionary Reasoning Does Not Arise in Standard Usage of Protein Language Models

Yasha Ektefaie, Andrew Shen, Lavik Jain et al.

NeurIPS 2025 | poster

Learning Video-Conditioned Policy on Unlabelled Data with Joint Embedding Predictive Transformer

Hao Luo, Zongqing Lu

ICLR 2025 | poster

Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory

Nikola Zubic, Federico Soldà, Aurelio Sulser et al.

ICLR 2025 | poster | arXiv:2405.16674 | 17 citations

Neural Attention Search

Difan Deng, Marius Lindauer

NeurIPS 2025 | poster | arXiv:2502.13251

Parallel Sequence Modeling via Generalized Spatial Propagation Network

Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.

CVPR 2025 | poster | arXiv:2501.12381 | 3 citations

Plug, Play, and Generalize: Length Extrapolation with Pointer-Augmented Neural Memory

Svetha Venkatesh, Kien Do, Hung Le et al.

ICLR 2025 | poster

Scaling Up Liquid-Resistance Liquid-Capacitance Networks for Efficient Sequence Modeling

Mónika Farsang, Radu Grosu

NeurIPS 2025 | poster | arXiv:2505.21717 | 4 citations

Selective Induction Heads: How Transformers Select Causal Structures in Context

Francesco D'Angelo, Francesco Croce, Nicolas Flammarion

ICLR 2025 | poster | arXiv:2509.08184 | 4 citations

State Space Models are Provably Comparable to Transformers in Dynamic Token Selection

Naoki Nishikawa, Taiji Suzuki

ICLR 2025 | poster | arXiv:2405.19036 | 6 citations

Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models

Benjamin Walker, Lingyi Yang, Nicola Muca Cirone et al.

NeurIPS 2025 | spotlight | arXiv:2505.17761 | 6 citations

Unsupervised Meta-Learning via In-Context Learning

Anna Vettoruzzo, Lorenzo Braccaioli, Joaquin Vanschoren et al.

ICLR 2025 | poster | arXiv:2405.16124 | 3 citations

What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains

Chanakya Ekbote, Ashok Vardhan Makkuva, Marco Bondaschi et al.

NeurIPS 2025 | spotlight | arXiv:2508.07208

ZeroS: Zero-Sum Linear Attention for Efficient Transformers

Jiecheng Lu, Xu Han, Yan Sun et al.

NeurIPS 2025 | spotlight

An Information-Theoretic Analysis of In-Context Learning

Hong Jun Jeon, Jason Lee, Qi Lei et al.

ICML 2024 | poster | arXiv:2401.15530

Efficient World Models with Context-Aware Tokenization

Vincent Micheli, Eloi Alonso, François Fleuret

ICML 2024 | poster | arXiv:2406.19320

From Generalization Analysis to Optimization Designs for State Space Models

Fusheng Liu, Qianxiao Li

ICML 2024 | oral

How Transformers Learn Causal Structure with Gradient Descent

Eshaan Nichani, Alex Damian, Jason Lee

ICML 2024 | poster | arXiv:2402.14735

Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning

Zeyang Liu, Lipeng Wan, Xinrui Yang et al.

AAAI 2024 | paper | arXiv:2402.17978

Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data

Shufan Li, Aditya Grover, Harkanwar Singh

ECCV 2024 | poster | arXiv:2402.05892 | 103 citations

Reinformer: Max-Return Sequence Modeling for Offline RL

Zifeng Zhuang, Dengyun Peng, Jinxin Liu et al.

ICML 2024 | poster | arXiv:2405.08740

Repeat After Me: Transformers are Better than State Space Models at Copying

Samy Jelassi, David Brandfonbrener, Sham Kakade et al.

ICML 2024 | poster | arXiv:2402.01032

Rethinking Decision Transformer via Hierarchical Reinforcement Learning

Yi Ma, Jianye Hao, Hebin Liang et al.

ICML 2024 | poster

Self-Distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach

Ziyin Zhang, Ning Lu, Minghui Liao et al.

AAAI 2024 | paper | arXiv:2308.08806 | 20 citations

Timer: Generative Pre-trained Transformers Are Large Time Series Models

Yong Liu, Haoran Zhang, Chenyu Li et al.

ICML 2024 | poster | arXiv:2402.02368

Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues

Antonio Orvieto, Soham De, Caglar Gulcehre et al.

ICML 2024 | poster

Vocabulary for Universal Approximation: A Linguistic Perspective of Mapping Compositions

Yongqiang Cai

ICML 2024 | spotlight | arXiv:2305.12205