🧬Architectures

State Space Models

SSMs, including the Mamba architecture

100 papers · 4,030 total citations
Also includes: state space models, ssm, mamba, s4, linear attention, structured ssm

Top Papers

#1

WorldSimBench: Towards Video Generation Models as World Simulators

Yiran Qin, Zhelun Shi, Jiwen Yu et al.

ICML 2025
806
citations
#2

VideoMamba: State Space Model for Efficient Video Understanding

Kunchang Li, Xinhao Li, Yi Wang et al.

ECCV 2024 · arXiv:2403.06977
state space models, video understanding, long-term modeling, efficient video processing, +4 more
401
citations
#3

Why Do Multi-Agent LLM Systems Fail?

Mert Cemri, Melissa Z Pan, Shuyi Yang et al.

NeurIPS 2025 · arXiv:2503.13657
multi-agent llm systems, failure pattern analysis, system failure taxonomy, llm-as-a-judge, +3 more
188
citations
#4

ZigMa: A DiT-style Zigzag Mamba Diffusion Model

Tao Hu, Stefan Andreas Baumann, Ming Gui et al.

ECCV 2024
188
citations
#5

MambaOut: Do We Really Need Mamba for Vision?

Weihao Yu, Xinchao Wang

CVPR 2025 · arXiv:2405.07992
state space model, attention mechanism, image classification, object detection, +4 more
186
citations
#6

SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM

Mingrui Li, Shuhong Liu, Heng Zhou et al.

ECCV 2024 · arXiv:2402.03246
gaussian splatting, visual slam, semantic segmentation, neural implicit slam, +4 more
131
citations
#7

Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Liliang Ren, Yang Liu, Yadong Lu et al.

ICLR 2025
115
citations
#8

IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection

Mingjin Zhang, Yuchun Wang, Jie Guo et al.

ECCV 2024 · arXiv:2407.07520
infrared small target detection, segment anything model, thermal image segmentation, perona-malik diffusion, +4 more
110
citations
#9

Motion Mamba: Efficient and Long Sequence Motion Generation

Zeyu Zhang, Akide Liu, Ian Reid et al.

ECCV 2024 · arXiv:2403.07487
state space models, motion generation, long sequence modeling, human motion generation, +4 more
108
citations
#10

ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation

Guanxing Lu, Shiyi Zhang, Ziwei Wang et al.

ECCV 2024
106
citations
#11

Agent S: An Open Agentic Framework that Uses Computers Like a Human

Saaket Agashe, Jiuzhou Han, Shuyu Gan et al.

ICLR 2025
100
citations
#12

MambaIRv2: Attentive State Space Restoration

Hang Guo, Yong Guo, Yaohua Zha et al.

CVPR 2025 · arXiv:2411.15269
image restoration, state space models, non-causal modeling, attention mechanism, +4 more
82
citations
#13

Point Cloud Mamba: Point Cloud Learning via State Space Model

Tao Zhang, Haobo Yuan, Lu Qi et al.

AAAI 2025
81
citations
#14

UMA: A Family of Universal Models for Atoms

Brandon Wood, Misko Dzamba, Xiang Fu et al.

NeurIPS 2025 · arXiv:2506.23971
atomic simulations, materials science, mixture of linear experts, empirical scaling laws, +4 more
62
citations
#15

Hymba: A Hybrid-head Architecture for Small Language Models

Xin Dong, Yonggan Fu, Shizhe Diao et al.

ICLR 2025 · arXiv:2411.13676
small language models, hybrid-head architecture, transformer attention mechanisms, state space models, +3 more
55
citations
#16

SubT-MRS Dataset: Pushing SLAM Towards All-weather Environments

Shibo Zhao, Yuanjun Gao, Tianhao Wu et al.

CVPR 2024
49
citations
#17

ReMamber: Referring Image Segmentation with Mamba Twister

Yuhuan Yang, Chaofan Ma, Jiangchao Yao et al.

ECCV 2024
49
citations
#18

OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers

Han Liang, Jiacheng Bao, Ruichi Zhang et al.

CVPR 2024
47
citations
#19

BAMM: Bidirectional Autoregressive Motion Model

Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Pu Wang et al.

ECCV 2024 · arXiv:2403.19435
text-to-motion generation, autoregressive motion models, motion tokenizer, masked self-attention transformer, +4 more
42
citations
#20

Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking

Heli Ben-Hamu, Itai Gat, Daniel Severo et al.

NeurIPS 2025 · arXiv:2505.24857
masked diffusion models, accelerated sampling, entropy bounded unmasking, language modeling, +3 more
40
citations
#21

TinySAM: Pushing the Envelope for Efficient Segment Anything Model

Han Shu, Wenshuo Li, Yehui Tang et al.

AAAI 2025
37
citations
#22

STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding?

Yun Li, Yiming Zhang, Tao Lin et al.

ICCV 2025
36
citations
#23

SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical Refinement and EM Optimization

Zhenlong Yuan, Jiakai Cao, Zhaoxin Li et al.

AAAI 2024 · arXiv:2401.06385
multi-view stereo, 3d reconstruction, textureless areas, segment anything model, +4 more
35
citations
#24

WISA: World simulator assistant for physics-aware text-to-video generation

Jing Wang, Ao Ma, Ke Cao et al.

NeurIPS 2025 · arXiv:2503.08153
text-to-video generation, physics-aware generation, world simulators, physical principles decomposition, +3 more
34
citations
#25

Scaling Wearable Foundation Models

Girish Narayanswamy, Xin Liu, Kumar Ayush et al.

ICLR 2025
33
citations
#26

System 1.x: Learning to Balance Fast and Slow Planning with Language Models

Swarnadeep Saha, Archiki Prasad, Justin Chen et al.

ICLR 2025
31
citations
#27

Longhorn: State Space Models are Amortized Online Learners

Bo Liu, Rui Wang, Lemeng Wu et al.

ICLR 2025
29
citations
#28

WildGS-SLAM: Monocular Gaussian Splatting SLAM in Dynamic Environments

Jianhao Zheng, Zihan Zhu, Valentin Bieri et al.

CVPR 2025
29
citations
#29

Fast-in-Slow: A Dual-System VLA Model Unifying Fast Manipulation within Slow Reasoning

Hao Chen, Jiaming Liu, Chenyang Gu et al.

NeurIPS 2025
robotic manipulation, vision-language-action model, dual-system architecture, parameter sharing, +4 more
27
citations
#30

EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality

Sanghyeok Lee, Joonmyung Choi, Hyunwoo J. Kim

CVPR 2025
25
citations
#31

MUSE-VL: Modeling Unified VLM through Semantic Discrete Encoding

Rongchang Xie, Chen Du, Ping Song et al.

ICCV 2025 · arXiv:2411.17762
vision-language models, semantic discrete encoding, multimodal understanding, visual generation, +3 more
25
citations
#32

MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders

Baijiong Lin, Weisen Jiang, Pengguang Chen et al.

ECCV 2024
25
citations
#33

VSSD: Vision Mamba with Non-Causal State Space Duality

Yuheng Shi, Mingjia Li, Minjing Dong et al.

ICCV 2025 · arXiv:2407.18559
state space models, vision transformers, non-causal modeling, image classification, +4 more
24
citations
#34

AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning

Duojun Huang, Xinyu Xiong, Jie Ma et al.

CVPR 2024
24
citations
#35

VideoMamba: Spatio-Temporal Selective State Space Model

Jinyoung Park, Hee-Seon Kim, Kangwook Ko et al.

ECCV 2024
23
citations
#36

Semantics-Adaptive Activation Intervention for LLMs via Dynamic Steering Vectors

Weixuan Wang, Jingyuan Yang, Wei Peng

ICLR 2025
23
citations
#37

G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems

Guibin Zhang, Muxin Fu, Kun Wang et al.

NeurIPS 2025
22
citations
#38

Robust Tracking via Mamba-based Context-aware Token Learning

Jinxia Xie, Bineng Zhong, Qihua Liang et al.

AAAI 2025
22
citations
#39

Oscillatory State-Space Models

T. Konstantin Rusch, Daniela Rus

ICLR 2025 · arXiv:2410.03943
state-space models, harmonic oscillators, long sequences, time-series forecasting, +4 more
21
citations
#40

2DMamba: Efficient State Space Model for Image Representation with Applications on Giga-Pixel Whole Slide Image Classification

Jingwei Zhang, Anh Tien Nguyen, Xi Han et al.

CVPR 2025 · arXiv:2412.00678
state space models, image representation, whole slide imaging, computational efficiency, +4 more
20
citations
#41

OccMamba: Semantic Occupancy Prediction with State Space Models

Heng Li, Yuenan Hou, Xiaohan Xing et al.

CVPR 2025
19
citations
#42

QMambaBSR: Burst Image Super-Resolution with Query State Space Model

Xin Di, Long Peng, Peizhe Xia et al.

CVPR 2025
19
citations
#43

TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets

Yuzhe YANG, Yifei Zhang, Minghao Wu et al.

NeurIPS 2025 · arXiv:2502.01506
multi-agent simulation, behavioral economics, large language model agents, social emergence, +4 more
19
citations
#44

SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models

Shuaijie Shen, Chao Wang, Renzhuo Huang et al.

AAAI 2025
18
citations
#45

Scaling and Masking: A New Paradigm of Data Sampling for Image and Video Quality Assessment

Yongxu Liu, Yinghui Quan, Guoyao Xiao et al.

AAAI 2024 · arXiv:2401.02614
image quality assessment, video quality assessment, data sampling methods, multi-scale representation, +4 more
17
citations
#46

Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures

Junxuan Wang, Xuyang Ge, Wentao Shu et al.

ICLR 2025
17
citations
#47

Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory

Nikola Zubic, Federico Soldà, Aurelio Sulser et al.

ICLR 2025 · arXiv:2405.16674
sequence modeling, structured state space models, function composition, computational complexity theory, +4 more
17
citations
#48

AlignMamba: Enhancing Multimodal Mamba with Local and Global Cross-modal Alignment

Yan Li, Yifei Xing, Xiangyuan Lan et al.

CVPR 2025 · arXiv:2412.00833
multimodal fusion, cross-modal alignment, mamba models, optimal transport, +3 more
17
citations
#49

Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence

Shangbin Feng, Zifeng Wang, Yike Wang et al.

ICML 2025
16
citations
#50

Quamba: A Post-Training Quantization Recipe for Selective State Space Models

Hung-Yueh Chiang, Chi-Chih Chang, Natalia Frumkin et al.

ICLR 2025
16
citations
#51

MambaIC: State Space Models for High-Performance Learned Image Compression

Fanhu Zeng, Hao Tang, Yihua Shao et al.

CVPR 2025
14
citations
#52

JamMa: Ultra-lightweight Local Feature Matching with Joint Mamba

Xiaoyong Lu, Songlin Du

CVPR 2025 · arXiv:2503.03437
local feature matching, mamba architecture, linear complexity, scan-merge strategy, +3 more
14
citations
#53

Event-based Video Super-Resolution via State Space Models

Zeyu Xiao, Xinchao Wang

CVPR 2025
13
citations
#54

Stable Segment Anything Model

Qi Fan, Xin Tao, Lei Ke et al.

ICLR 2025 · arXiv:2311.15776
promptable segmentation, segmentation stability, deformable sampling, mask attention calibration, +3 more
12
citations
#55

Symphony: Symmetry-Equivariant Point-Centered Spherical Harmonics for 3D Molecule Generation

Ameya Daigavane, Song Eun Kim, Mario Geiger et al.

ICLR 2024
11
citations
#56

DG-Mamba: Robust and Efficient Dynamic Graph Structure Learning with Selective State Space Models

Haonan Yuan, Qingyun Sun, Zhaonan Wang et al.

AAAI 2025
11
citations
#57

Efficiently Parameterized Neural Metriplectic Systems

Anthony Gruber, Kookjin Lee, Haksoo Lim et al.

ICLR 2025 · arXiv:2405.16305
metriplectic systems, energy conserving systems, entropy stability, dynamics learning, +2 more
10
citations
#58

RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing

Fengxiang Wang, Yulin Wang, Mingshuo Chen et al.

NeurIPS 2025 · arXiv:2503.10392
remote sensing foundation models, mamba architecture, self-supervised learning, linear-complexity models, +4 more
10
citations
#59

Fast training and sampling of Restricted Boltzmann Machines

Nicolas BEREUX, Aurélien Decelle, Cyril Furtlehner et al.

ICLR 2025 · arXiv:2405.15376
restricted boltzmann machines, markov chain monte carlo, parallel trajectory tempering, partition function computation, +4 more
10
citations
#60

Sports-Traj: A Unified Trajectory Generation Model for Multi-Agent Movement in Sports

Yi Xu, Yun Fu

ICLR 2025 · arXiv:2405.17680
trajectory generation, multi-agent movement, trajectory prediction, spatial-temporal recovery, +4 more
10
citations
#61

Learning Semantic Latent Directions for Accurate and Controllable Human Motion Prediction

Guowei Xu, Jiale Tao, Wen Li et al.

ECCV 2024 · arXiv:2407.11494
human motion prediction, semantic latent directions, generative models, latent space control, +3 more
9
citations
#62

Motion Diversification Networks

Hee Jae Kim, Eshed Ohn-Bar

CVPR 2024
9
citations
#63

SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters

Jianping Jiang, Weiye Xiao, Zhengyu Lin et al.

CVPR 2025
9
citations
#64

Trajectory Mamba: Efficient Attention-Mamba Forecasting Model Based on Selective SSM

Yizhou Huang, Yihua Cheng, Kezhi Wang

CVPR 2025 · arXiv:2503.10898
trajectory prediction, selective state-space model, autonomous driving, motion forecasting, +4 more
9
citations
#65

Hyperion – A fast, versatile symbolic Gaussian Belief Propagation framework for Continuous-Time SLAM

David Hug, Ignacio Alzugaray Lopez, Margarita Chli

ECCV 2024
8
citations
#66

Distilling Structural Representations into Protein Sequence Models

Jeffrey Ouyang-Zhang, Chengyue Gong, Yue Zhao et al.

ICLR 2025
protein language models, structure token generation, mutation stability assessment, protein structure prediction, +4 more
8
citations
#67

PRE-Mamba: A 4D State Space Model for Ultra-High-Frequent Event Camera Deraining

Ciyu Ruan, Ruishan Guo, Zihang GONG et al.

ICCV 2025 · arXiv:2505.05307
event camera deraining, 4d state space model, spatiotemporal event representation, point-based event processing, +3 more
8
citations
#68

Compositional simulation-based inference for time series

Manuel Gloeckler, Shoji Toyota, Kenji Fukumizu et al.

ICLR 2025
8
citations
#69

Sparse Learning for State Space Models on Mobile

Xuan Shen, Hangyu Zheng, Yifan Gong et al.

ICLR 2025
8
citations
#70

LOMA: Language-assisted Semantic Occupancy Network via Triplane Mamba

Yubo Cui, Zhiheng Li, Jiaqiang Wang et al.

AAAI 2025
8
citations
#71

ModeSeq: Taming Sparse Multimodal Motion Prediction with Sequential Mode Modeling

Zikang Zhou, Hengjian Zhou, Haibo Hu et al.

CVPR 2025
7
citations
#72

SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance

Peishan Cong, Ziyi Wang, Yuexin Ma et al.

CVPR 2025
7
citations
#73

M3amba: Memory Mamba is All You Need for Whole Slide Image Classification

Tingting Zheng, Kui Jiang, Yi Xiao et al.

CVPR 2025
7
citations
#74

Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing

Peihao Wang, Ruisi Cai, Yuehao Wang et al.

ICLR 2025
7
citations
#75

S4M: S4 for multivariate time series forecasting with Missing values

Jing Peng, Meiqi Yang, Qiong Zhang et al.

ICLR 2025 · arXiv:2503.00900
multivariate time series forecasting, missing data handling, structured state space models, end-to-end forecasting, +4 more
7
citations
#76

Momentum Multi-Marginal Schrödinger Bridge Matching

Panagiotis Theodoropoulos, Augustinos Saravanos, Evangelos Theodorou et al.

NeurIPS 2025 · arXiv:2506.10168
schrödinger bridge matching, multi-marginal optimal control, measure-valued splines, stochastic bridges, +4 more
6
citations
#77

Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models

Benjamin Walker, Lingyi Yang, Nicola Muca Cirone et al.

NeurIPS 2025 · arXiv:2505.17761
controlled differential equations, state-transition matrices, sequence modeling, parallel-in-time computation, +3 more
6
citations
#78

MambaVLT: Time-Evolving Multimodal State Space Model for Vision-Language Tracking

Xinqi Liu, Li Zhou, Zikun Zhou et al.

CVPR 2025
6
citations
#79

OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding

Jingli Lin, Chenming Zhu, Runsen Xu et al.

NeurIPS 2025
6
citations
#80

RadarMOSEVE: A Spatial-Temporal Transformer Network for Radar-Only Moving Object Segmentation and Ego-Velocity Estimation

Changsong Pang, Xieyuanli Chen, Yimin Liu et al.

AAAI 2024 · arXiv:2402.14380
moving object segmentation, ego-velocity estimation, radar point clouds, spatial-temporal transformer, +4 more
6
citations
#81

Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning

Ali Taghibakhshi, Sharath Turuvekere Sreenivas, Saurav Muralidharan et al.

NeurIPS 2025
6
citations
#82

State Space Models are Provably Comparable to Transformers in Dynamic Token Selection

Naoki Nishikawa, Taiji Suzuki

ICLR 2025 · arXiv:2405.19036
state space models, sequence modeling, dynamic token selection, nonlinear layers, +2 more
6
citations
#83

SaMam: Style-aware State Space Model for Arbitrary Image Style Transfer

Hongda Liu, Longguang Wang, Ye Zhang et al.

CVPR 2025
6
citations
#84

Autocorrelation Matters: Understanding the Role of Initialization Schemes for State Space Models

Fusheng Liu, Qianxiao Li

ICLR 2025 · arXiv:2411.19455
state space models, initialization schemes, autocorrelation analysis, timescale characterization, +3 more
6
citations
#85

PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model

Mingju Gao, Yike Pan, Huan-ang Gao et al.

CVPR 2025 · arXiv:2503.19913
part-level dynamics, 4d reconstruction framework, multi-view images, 3d gaussian reconstruction, +4 more
6
citations
#86

Parameter-Efficient Fine-Tuning of State Space Models

Kevin Galim, Wonjun Kang, Yuchen Zeng et al.

ICML 2025
6
citations
#87

GroupMamba: Efficient Group-Based Visual State Space Model

Abdelrahman Shaker, Syed Talal Wasim, Salman Khan et al.

CVPR 2025
6
citations
#88

SAE-V: Interpreting Multimodal Models for Enhanced Alignment

Hantao Lou, Changye Li, Jiaming Ji et al.

ICML 2025
6
citations
#89

SEGS-SLAM: Structure-enhanced 3D Gaussian Splatting SLAM with Appearance Embedding

Tianci Wen, Zhiang Liu, Yongchun Fang

ICCV 2025
5
citations
#90

MOSCATO: Predicting Multiple Object State Change Through Actions

Parnian Zameni, Yuhan Shen, Ehsan Elhamifar

ICCV 2025
5
citations
#91

Multi-Modal View Enhanced Large Vision Models for Long-Term Time Series Forecasting

ChengAo Shen, Wenchao Yu, Ziming Zhao et al.

NeurIPS 2025 · arXiv:2505.24003
long-term time series forecasting, multi-modal views, trend-seasonal decomposition, large vision models, +2 more
5
citations
#92

Learning Safe Action Models with Partial Observability

Hai Le, Brendan Juba, Roni Stern

AAAI 2024
5
citations
#93

Sable: a Performant, Efficient and Scalable Sequence Model for MARL

Omayma Mahjoub, Sasha Abramowitz, Ruan de Kock et al.

ICML 2025
4
citations
#94

ZigzagPointMamba: Spatial-Semantic Mamba for Point Cloud Understanding

Linshuang Diao, Sensen Song, Yurong Qian et al.

NeurIPS 2025
4
citations
#95

SSAN: A Symbol Spatial-Aware Network for Handwritten Mathematical Expression Recognition

Haoran Zhang, Xiangdong Su, Xingxiang Zhou et al.

AAAI 2025
4
citations
#96

OuroMamba: A Data-Free Quantization Framework for Vision Mamba

Akshat Ramachandran, Mingyu Lee, Huan Xu et al.

ICCV 2025 · arXiv:2503.10959
vision mamba models, data-free quantization, post-training quantization, contrastive learning, +3 more
4
citations
#97

🎧MOSPA: Human Motion Generation Driven by Spatial Audio

Shuyang Xu, Zhiyang Dou, Mingyi Shi et al.

NeurIPS 2025
4
citations
#98

Scaling Up Liquid-Resistance Liquid-Capacitance Networks for Efficient Sequence Modeling

Mónika Farsang, Radu Grosu

NeurIPS 2025 · arXiv:2505.21717
sequence modeling, recurrent models, state-space layers, gradient stability, +3 more
4
citations
#99

SBSC: Step-by-Step Coding for Improving Mathematical Olympiad Performance

Kunal Singh, Ankan Biswas, Sayandeep Bhowmick et al.

ICLR 2025
4
citations
#100

Epistemic Monte Carlo Tree Search

Yaniv Oren, Viliam Vadocz, Matthijs T. J. Spaan et al.

ICLR 2025
4
citations