🧬Video Analysis

Object Tracking

Tracking objects across video frames

250 papers (showing top 100) · 3,210 total citations
Mar '24 to Feb '26: 215 papers

Also includes: object tracking, visual tracking, video tracking, multi-object tracking, mot

Top Papers

#1

CoTracker: It is Better to Track Together

Nikita Karaev, Ignacio Rocco, Ben Graham et al.

ECCV 2024 · arXiv:2307.07635
point tracking, transformer-based model, joint tracking, occlusion handling (+4 more)
449
citations
#2

FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

Bowen Wen, Wei Yang, Jan Kautz et al.

CVPR 2024 · arXiv:2312.08344
412
citations
#3

CoTracker3: Simpler and Better Point Tracking by Pseudo-Labelling Real Videos

Nikita Karaev, Iurii Makarov, Jianyuan Wang et al.

ICCV 2025 · arXiv:2410.11831
211
citations
#4

Putting the Object Back into Video Object Segmentation

Ho Kei Cheng, Seoung Wug Oh, Brian Price et al.

CVPR 2024 · arXiv:2310.12982
182
citations
#5

ODTrack: Online Dense Temporal Token Learning for Visual Tracking

Yaozong Zheng, Bineng Zhong, Qihua Liang et al.

AAAI 2024 · arXiv:2401.01686
visual tracking, online tracking, temporal token learning, token propagation (+3 more)
173
citations
#6

OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning

Lingyi Hong, Shilin Yan, Renrui Zhang et al.

CVPR 2024 · arXiv:2403.09634
118
citations
#7

UCMCTrack: Multi-Object Tracking with Uniform Camera Motion Compensation

Kefu Yi, Kai Luo, Xiaolei Luo et al.

AAAI 2024 · arXiv:2312.08952
multi-object tracking, camera motion compensation, kalman filter, homography projection (+4 more)
97
citations
#8

HIPTrack: Visual Tracking with Historical Prompts

Wenrui Cai, Qingjie Liu, Yunhong Wang

CVPR 2024 · arXiv:2311.02072
96
citations
#9

Single-Model and Any-Modality for Video Object Tracking

Zongwei Wu, Jilai Zheng, Xiangxuan Ren et al.

CVPR 2024 · arXiv:2311.15851
96
citations
#10

Event Stream-based Visual Object Tracking: A High-Resolution Benchmark Dataset and A Novel Baseline

Xiao Wang, Shiao Wang, Chuanming Tang et al.

CVPR 2024 · arXiv:2309.14611
82
citations
#11

Temporal Adaptive RGBT Tracking with Modality Prompt

Hongyu Wang, Xiaotao Liu, Yifan Li et al.

AAAI 2024 · arXiv:2401.01244
rgbt tracking, modality prompt, spatio-temporal interaction, online template update (+4 more)
71
citations
#12

DiffusionTrack: Diffusion Model for Multi-Object Tracking

Run Luo, Zikai Song, Lintao Ma et al.

AAAI 2024 · arXiv:2308.09905
multi-object tracking, denoising diffusion process, tracking-by-detection, joint detection and tracking (+3 more)
65
citations
#13

DiffMOT: A Real-time Diffusion-based Multiple Object Tracker with Non-linear Prediction

Weiyi Lv, Yuhang Huang, Ning Zhang et al.

CVPR 2024 · arXiv:2403.02075
59
citations
#14

VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection

Songhao Han, Wei Huang, Hairong Shi et al.

CVPR 2025
54
citations
#15

End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames

Shuming Liu, Chenlin Zhang, Chen Zhao et al.

CVPR 2024 · arXiv:2311.17241
51
citations
#16

M-LLM Based Video Frame Selection for Efficient Video Understanding

Kai Hu, Feng Gao, Xiaohan Nie et al.

CVPR 2025
46
citations
#17

LEAP-VO: Long-term Effective Any Point Tracking for Visual Odometry

Weirong Chen, Le Chen, Rui Wang et al.

CVPR 2024 · arXiv:2401.01887
44
citations
#18

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Yuqian Yuan, Hang Zhang, Wentong Li et al.

CVPR 2025 · arXiv:2501.00599
40
citations
#19

Scene Adaptive Sparse Transformer for Event-based Object Detection

Yansong Peng, Hebei Li, Yueyi Zhang et al.

CVPR 2024 · arXiv:2404.01882
40
citations
#20

Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking

Xiantao Hu, Ying Tai, Xu Zhao et al.

AAAI 2025 · arXiv:2412.15691
38
citations
#21

SUTrack: Towards Simple and Unified Single Object Tracking

Xin Chen, Ben Kang, Wanting Geng et al.

AAAI 2025 · arXiv:2412.19138
37
citations
#22

TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models

Mark YU, Wenbo Hu, Jinbo Xing et al.

ICCV 2025 · arXiv:2503.05638
35
citations
#23

ADA-Track: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association

Shuxiao Ding, Lukas Schneider, Marius Cordts et al.

CVPR 2024 · arXiv:2405.08909
34
citations
#24

Towards Generalizable Multi-Object Tracking

Zheng Qin, Le Wang, Sanping Zhou et al.

CVPR 2024 · arXiv:2406.00429
32
citations
#25

REACTO: Reconstructing Articulated Objects from a Single Video

Chaoyue Song, Jiacheng Wei, Chuan-Sheng Foo et al.

CVPR 2024 · arXiv:2404.11151
32
citations
#26

LEOD: Label-Efficient Object Detection for Event Cameras

Ziyi Wu, Mathias Gehrig, Qing Lyu et al.

CVPR 2024 · arXiv:2311.17286
30
citations
#27

Exploring Enhanced Contextual Information for Video-Level Object Tracking

Ben Kang, Xin Chen, Simiao Lai et al.

AAAI 2025 · arXiv:2412.11023
27
citations
#28

Sparse Global Matching for Video Frame Interpolation with Large Motion

Chunxu Liu, Guozhen Zhang, Rui Zhao et al.

CVPR 2024 · arXiv:2404.06913
27
citations
#29

Trackastra: Transformer-based cell tracking for live-cell microscopy

Benjamin Gallusser, Martin Weigert

ECCV 2024
26
citations
#30

Multi-Object Tracking in the Dark

Xinzhe Wang, Kang Ma, Qiankun Liu et al.

CVPR 2024 · arXiv:2405.06600
25
citations
#31

MotionFollower: Editing Video Motion via Score-Guided Diffusion

Shuyuan Tu, Qi Dai, Zihao Zhang et al.

ICCV 2025
22
citations
#32

Self-Supervised Multi-Object Tracking with Path Consistency

Zijia Lu, Bing Shuai, Yanbei Chen et al.

CVPR 2024 · arXiv:2404.05136
21
citations
#33

HMD-Poser: On-Device Real-time Human Motion Tracking from Scalable Sparse Observations

Peng Dai, Yang Zhang, Tao Liu et al.

CVPR 2024 · arXiv:2403.03561
21
citations
#34

Guided Slot Attention for Unsupervised Video Object Segmentation

Minhyeok Lee, Suhwan Cho, Dogyoon Lee et al.

CVPR 2024 · arXiv:2303.08314
21
citations
#35

Learning to Predict Activity Progress by Self-Supervised Video Alignment

Gerard Donahue, Ehsan Elhamifar

CVPR 2024
20
citations
#36

FlowTrack: Revisiting Optical Flow for Long-Range Dense Tracking

Seokju Cho, Gabriel Huang, Seungryong Kim et al.

CVPR 2024
20
citations
#37

DriveTrack: A Benchmark for Long-Range Point Tracking in Real-World Videos

Arjun Balasingam, Joseph Chandler, Chenning Li et al.

CVPR 2024 · arXiv:2312.09523
18
citations
#38

Adapting Short-Term Transformers for Action Detection in Untrimmed Videos

Min Yang, Huan Gao, Ping Guo et al.

CVPR 2024 · arXiv:2312.01897
17
citations
#39

Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes

Gaurav Shrivastava, Abhinav Shrivastava

CVPR 2024
16
citations
#40

Track-On: Transformer-based Online Point Tracking with Memory

Görkay Aydemir, Xiongyi Cai, Weidi Xie et al.

ICLR 2025 · arXiv:2501.18487
point tracking, long-term tracking, online tracking, transformer-based model (+3 more)
16
citations
#41

AllTracker: Efficient Dense Point Tracking at High Resolution

Adam Harley, Yang You, Yang Zheng et al.

ICCV 2025
15
citations
#42

Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking

Jiawen Zhu, Huayi Tang, Xin Chen et al.

AAAI 2025 · arXiv:2503.00516
15
citations
#43

3D Multi-frame Fusion for Video Stabilization

Zhan Peng, Xinyi Ye, Weiyue Zhao et al.

CVPR 2024 · arXiv:2404.12887
13
citations
#44

Multi-Scene Generalized Trajectory Global Graph Solver with Composite Nodes for Multiple Object Tracking

Yan Gao, Haojun Xu, Jie Li et al.

AAAI 2024 · arXiv:2312.08951
multiple object tracking, graph-based tracking, trajectory association, graph neural network (+3 more)
13
citations
#45

M3SOT: Multi-Frame, Multi-Field, Multi-Space 3D Single Object Tracking

Jiaming Liu, Yue Wu, Maoguo Gong et al.

AAAI 2024 · arXiv:2312.06117
3d single object tracking, point cloud processing, transformer-based network, multi-frame tracking (+3 more)
12
citations
#46

MDP3: A Training-free Approach for List-wise Frame Selection in Video-LLMs

Hui Sun, Shiyin Lu, Huanyu Wang et al.

ICCV 2025
12
citations
#47

Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation

Guy Yariv, Yuval Kirstain, Amit Zohar et al.

CVPR 2025 · arXiv:2501.03059
12
citations
#48

Semantic and Sequential Alignment for Referring Video Object Segmentation

Feiyu Pan, Hao Fang, Fangkai Li et al.

CVPR 2025
11
citations
#49

Instance Tracking in 3D Scenes from Egocentric Videos

Yunhan Zhao, Haoyu Ma, Shu Kong et al.

CVPR 2024 · arXiv:2312.04117
11
citations
#50

XTrack: Multimodal Training Boosts RGB-X Video Object Trackers

Yuedong Tan, Zongwei Wu, Yuqian Fu et al.

ICCV 2025 · arXiv:2405.17773
10
citations
#51

ObjectMover: Generative Object Movement with Video Prior

Xin Yu, Tianyu Wang, Soo Ye Kim et al.

CVPR 2025 · arXiv:2503.08037
object movement, video generation model, lighting harmonization, image editing (+3 more)
10
citations
#52

Emergent Temporal Correspondences from Video Diffusion Transformers

Jisu Nam, Soowon Son, Dahyun Chung et al.

NeurIPS 2025 · arXiv:2506.17220
10
citations
#53

MUST: The First Dataset and Unified Framework for Multispectral UAV Single Object Tracking

Haolin Qin, Tingfa Xu, Tianhao Li et al.

CVPR 2025
9
citations
#54

Focusing on Tracks for Online Multi-Object Tracking

Kyujin Shim, Kangwook Ko, YuJin Yang et al.

CVPR 2025
8
citations
#55

SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking

Siyuan Li, Lei Ke, Yung-Hsu Yang et al.

ECCV 2024 · arXiv:2409.11235
8
citations
#56

Exploring Historical Information for RGBE Visual Tracking with Mamba

Chuanyu Sun, Jiqing Zhang, Yang Wang et al.

CVPR 2025
7
citations
#57

4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface Gaussians

Hidenobu Matsuki, Gwangbin Bae, Andrew J. Davison

CVPR 2025 · arXiv:2505.22859
6
citations
#58

OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer

Jinyang Li, En Yu, Sijia Chen et al.

ICLR 2025 · arXiv:2503.10616
6
citations
#59

6D Object Pose Tracking in Internet Videos for Robotic Manipulation

Georgy Ponimatkin, Martin Cífka, Tomas Soucek et al.

ICLR 2025
6
citations
#60

Dual Conditioned Motion Diffusion for Pose-Based Video Anomaly Detection

Hongsong Wang, Andi Xu, Pinle Ding et al.

AAAI 2025 · arXiv:2412.17210
6
citations
#61

Recognizing Ultra-High-Speed Moving Objects with Bio-Inspired Spike Camera

Junwei Zhao, Shiliang Zhang, Zhaofei Yu et al.

AAAI 2024
5
citations
#62

A Unified Framework for Human-centric Point Cloud Video Understanding

Yiteng Xu, Kecheng Ye, Xiao Han et al.

CVPR 2024 · arXiv:2403.20031
5
citations
#63

Omnidirectional Multi-Object Tracking

Kai Luo, Hao Shi, Sheng Wu et al.

CVPR 2025
5
citations
#64

Fine-grained Spatiotemporal Grounding on Egocentric Videos

Shuo Liang, Yiwu Zhong, Zi-Yuan Hu et al.

ICCV 2025 · arXiv:2508.00518
spatiotemporal video grounding, egocentric video understanding, pixel-level benchmark, automatic annotation pipeline (+4 more)
5
citations
#65

Projecting Trackable Thermal Patterns for Dynamic Computer Vision

Mark Sheinin, Aswin C. Sankaranarayanan, Srinivasa G. Narasimhan

CVPR 2024
5
citations
#66

Precise Event Spotting in Sports Videos: Solving Long-Range Dependency and Class Imbalance

Sanchayan Santra, Vishal Chudasama, Pankaj Wasnik et al.

CVPR 2025
4
citations
#67

Flow4Agent: Long-form Video Understanding via Motion Prior from Optical Flow

Ruyang Liu, Shangkun Sun, Haoran Tang et al.

ICCV 2025
3
citations
#68

TimeTracker: Event-based Continuous Point Tracking for Video Frame Interpolation with Non-linear Motion

Haoyue Liu, Jinghan Xu, Yi Chang et al.

CVPR 2025 · arXiv:2505.03116
video frame interpolation, event cameras, non-linear motion, continuous point tracking (+4 more)
3
citations
#69

Walker: Self-supervised Multiple Object Tracking by Walking on Temporal Object Appearance Graphs

Mattia Segu, Luigi Piccinelli, Siyuan Li et al.

ECCV 2024
3
citations
#70

PoseTraj: Pose-Aware Trajectory Control in Video Diffusion

Longbin Ji, Lei Zhong, Pengfei Wei et al.

CVPR 2025
3
citations
#71

VETRA: A Dataset for Vehicle Tracking in Aerial Imagery - New Challenges for Multi-Object Tracking

Jens Hellekes, Manuel Mühlhaus, Reza Bahmanyar et al.

ECCV 2024
vehicle tracking, aerial imagery, multi-object tracking, moving camera scenarios (+4 more)
3
citations
#72

TLB-VFI: Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation

Zonglin Lyu, Chen Chen

ICCV 2025 · arXiv:2507.04984
video frame interpolation, diffusion models, latent brownian bridge, temporal-aware autoencoder (+3 more)
3
citations
#73

What You Have is What You Track: Adaptive and Robust Multimodal Tracking

Yuedong Tan, Jiawei Shao, Eduard Zamfir et al.

ICCV 2025 · arXiv:2507.05899
3
citations
#74

Tracktention: Leveraging Point Tracking to Attend Videos Faster and Better

Zihang Lai, Andrea Vedaldi

CVPR 2025 · arXiv:2503.19904
3
citations
#75

Video Individual Counting for Moving Drones

Yaowu Fan, Jia Wan, Tao Han et al.

ICCV 2025
3
citations
#76

HumanMM: Global Human Motion Recovery from Multi-shot Videos

Yuhong Zhang, Guanlin Wu, Ling-Hao Chen et al.

CVPR 2025 · arXiv:2503.07597
3
citations
#77

Event2Tracking: Reconstructing Multi-Agent Soccer Trajectories Using Long-Term Multimodal Context

Harry Hughes, Michael Horton, Xinyu Wei et al.

AAAI 2025
3
citations
#78

Cross-Modal Stealth: A Coarse-to-Fine Attack Framework for RGB-T Tracker

Xinyu Xiang, Qinglong Yan, Hao Zhang et al.

AAAI 2025
3
citations
#79

Exploiting Continuous Motion Clues for Vision-Based Occupancy Prediction

Haoran Xu, Peixi Peng, Xinyi Zhang et al.

AAAI 2025
2
citations
#80

TRACE: Learning 3D Gaussian Physical Dynamics from Multi-view Videos

Jinxi Li, Ziyang Song, Bo Yang

ICCV 2025 · arXiv:2508.09811
2
citations
#81

Hand-held Object Reconstruction from RGB Video with Dynamic Interaction

Shijian Jiang, Qi Ye, Rengan Xie et al.

CVPR 2025
2
citations
#82

BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions

Wonyong Seo, Jihyong Oh, Munchurl Kim

CVPR 2025 · arXiv:2412.11365
2
citations
#83

Track Any Anomalous Object: A Granular Video Anomaly Detection Pipeline

Yuzhi Huang, Chenxin Li, Haitao Zhang et al.

CVPR 2025
2
citations
#84

Everything is a Video: Unifying Modalities through Next-Frame Prediction

G Thomas Hudson, Dean Slack, Thomas Winterbottom et al.

ICCV 2025
2
citations
#85

PvNeXt: Rethinking Network Design and Temporal Motion for Point Cloud Video Recognition

Jie Wang, Tingfa Xu, Lihe Ding et al.

ICLR 2025 · arXiv:2504.05075
2
citations
#86

GLOMA: Global Video Text Spotting with Morphological Association

Han Wang, Yanjie Wang, Yang Li et al.

ICLR 2025
2
citations
#87

S2-Track: A Simple yet Strong Approach for End-to-End 3D Multi-Object Tracking

Tao Tang, Lijun Zhou, Pengkun Hao et al.

ICML 2025 · arXiv:2406.02147
2
citations
#88

Efficient Motion Prompt Learning for Robust Visual Tracking

Jie Zhao, Xin Chen, Yongsheng Yuan et al.

ICML 2025 · arXiv:2505.16321
1
citations
#89

TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels

Jiahao Lu, Weitao Xiong, Jiacheng Deng et al.

NeurIPS 2025 · arXiv:2512.08358
monocular 3d tracking, dense 2d tracking, world-centric coordinate system, camera pose estimation (+3 more)
1
citations
#90

Is This Tracker On? A Benchmark Protocol for Dynamic Tracking

Ilona Demler, Saumya Chauhan, Georgia Gkioxari

NeurIPS 2025
1
citations
#91

Motion-Zero: A Zero-Shot Trajectory Control Framework of Moving Object for Diffusion-Based Video Generation

Changgu Chen, Junwei Shu, Gaoqi He et al.

AAAI 2025
1
citations
#92

FRAME: Floor-aligned Representation for Avatar Motion from Egocentric Video

Andrea Boscolo Camiletto, Jian Wang, Eduardo Alvarado et al.

CVPR 2025 · arXiv:2503.23094
1
citations
#93

Multi-View 3D Point Tracking

Frano Rajič, Haofei Xu, Marko Mihajlovic et al.

ICCV 2025 · arXiv:2508.21060
1
citations
#94

TAPTR: Tracking Any Point with Transformers as Detection

Hongyang Li, Hao Zhang, Shilong Liu et al.

ECCV 2024
Citations: not collected
#95

DreamTrack: Dreaming the Future for Multimodal Visual Object Tracking

Mingzhe Guo, Weiping Tan, Wenyu Ran et al.

CVPR 2025
Citations: not collected
#96

Spatial-Temporal Multi-level Association for Video Object Segmentation

Deshui Miao, Xin Li, Zhenyu He et al.

ECCV 2024
Citations: not collected
#97

Track4Gen: Teaching Video Diffusion Models to Track Points Improves Video Generation

Hyeonho Jeong, Chun-Hao P. Huang, Jong Chul Ye et al.

CVPR 2025 · arXiv:2412.06016
Citations: not collected
#98

Local All-Pair Correspondence for Point Tracking

Seokju Cho, Jiahui Huang, Jisu Nam et al.

ECCV 2024
Citations: not collected
#99

VideoOrion: Tokenizing Object Dynamics in Videos

Yicheng Feng, Yijiang Li, Wanpeng Zhang et al.

ICCV 2025 · arXiv:2411.16156
object dynamics, video large language models, object tokenization, spatial-temporal feature aggregation (+2 more)
Citations: not collected
#100

COVTrack: Continuous Open-Vocabulary Tracking via Adaptive Multi-Cue Fusion

Zekun Qian, Ruize Han, Zhixiang Wang et al.

ICCV 2025
Citations: not collected