Object Tracking
Tracking objects across video frames
Related Topics (Video Analysis)
Top Papers
CoTracker: It is Better to Track Together
Nikita Karaev, Ignacio Rocco, Ben Graham et al.
FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
Bowen Wen, Wei Yang, Jan Kautz et al.
CoTracker3: Simpler and Better Point Tracking by Pseudo-Labelling Real Videos
Nikita Karaev, Iurii Makarov, Jianyuan Wang et al.
Putting the Object Back into Video Object Segmentation
Ho Kei Cheng, Seoung Wug Oh, Brian Price et al.
ODTrack: Online Dense Temporal Token Learning for Visual Tracking
Yaozong Zheng, Bineng Zhong, Qihua Liang et al.
OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning
Lingyi Hong, Shilin Yan, Renrui Zhang et al.
UCMCTrack: Multi-Object Tracking with Uniform Camera Motion Compensation
Kefu Yi, Kai Luo, Xiaolei Luo et al.
HIPTrack: Visual Tracking with Historical Prompts
Wenrui Cai, Qingjie Liu, Yunhong Wang
Single-Model and Any-Modality for Video Object Tracking
Zongwei Wu, Jilai Zheng, Xiangxuan Ren et al.
Event Stream-based Visual Object Tracking: A High-Resolution Benchmark Dataset and A Novel Baseline
Xiao Wang, Shiao Wang, Chuanming Tang et al.
Temporal Adaptive RGBT Tracking with Modality Prompt
Hongyu Wang, Xiaotao Liu, Yifan Li et al.
DiffusionTrack: Diffusion Model for Multi-Object Tracking
Run Luo, Zikai Song, Lintao Ma et al.
DiffMOT: A Real-time Diffusion-based Multiple Object Tracker with Non-linear Prediction
Weiyi Lv, Yuhang Huang, NING Zhang et al.
VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection
Songhao Han, Wei Huang, Hairong Shi et al.
End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames
Shuming Liu, Chenlin Zhang, Chen Zhao et al.
M-LLM Based Video Frame Selection for Efficient Video Understanding
Kai Hu, Feng Gao, Xiaohan Nie et al.
LEAP-VO: Long-term Effective Any Point Tracking for Visual Odometry
Weirong Chen, Le Chen, Rui Wang et al.
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
Yuqian Yuan, Hang Zhang, Wentong Li et al.
Scene Adaptive Sparse Transformer for Event-based Object Detection
Yansong Peng, Li Hebei, Yueyi Zhang et al.
Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking
Xiantao Hu, Ying Tai, Xu Zhao et al.
SUTrack: Towards Simple and Unified Single Object Tracking
Xin Chen, Ben Kang, Wanting Geng et al.
TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
Mark YU, Wenbo Hu, Jinbo Xing et al.
ADA-Track: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association
Shuxiao Ding, Lukas Schneider, Marius Cordts et al.
Towards Generalizable Multi-Object Tracking
Zheng Qin, Le Wang, Sanping Zhou et al.
REACTO: Reconstructing Articulated Objects from a Single Video
Chaoyue Song, Jiacheng Wei, Chuan-Sheng Foo et al.
LEOD: Label-Efficient Object Detection for Event Cameras
Ziyi Wu, Mathias Gehrig, Qing Lyu et al.
Exploring Enhanced Contextual Information for Video-Level Object Tracking
Ben Kang, Xin Chen, Simiao Lai et al.
Sparse Global Matching for Video Frame Interpolation with Large Motion
Chunxu Liu, Guozhen Zhang, Rui Zhao et al.
Trackastra: Transformer-based cell tracking for live-cell microscopy
Benjamin Gallusser, Weigert Martin
Multi-Object Tracking in the Dark
Xinzhe Wang, Kang Ma, Qiankun Liu et al.
MotionFollower: Editing Video Motion via Score-Guided Diffusion
Shuyuan Tu, Qi Dai, Zihao Zhang et al.
Self-Supervised Multi-Object Tracking with Path Consistency
Zijia Lu, Bing Shuai, Yanbei Chen et al.
HMD-Poser: On-Device Real-time Human Motion Tracking from Scalable Sparse Observations
Peng Dai, Yang Zhang, Tao Liu et al.
Guided Slot Attention for Unsupervised Video Object Segmentation
Minhyeok Lee, Suhwan Cho, Dogyoon Lee et al.
Learning to Predict Activity Progress by Self-Supervised Video Alignment
Gerard Donahue, Ehsan Elhamifar
FlowTrack: Revisiting Optical Flow for Long-Range Dense Tracking
Seokju Cho, Gabriel Huang, Seungryong Kim et al.
DriveTrack: A Benchmark for Long-Range Point Tracking in Real-World Videos
Arjun Balasingam, Joseph Chandler, Chenning Li et al.
Adapting Short-Term Transformers for Action Detection in Untrimmed Videos
Min Yang, gaohuan, Ping Guo et al.
Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes
Gaurav Shrivastava, Abhinav Shrivastava
Track-On: Transformer-based Online Point Tracking with Memory
GΓΆrkay Aydemir, Xiongyi Cai, Weidi Xie et al.
AllTracker: Efficient Dense Point Tracking at High Resolution
Adam Harley, Yang You, Yang Zheng et al.
Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking
Jiawen Zhu, Huayi Tang, Xin Chen et al.
3D Multi-frame Fusion for Video Stabilization
Zhan Peng, Xinyi Ye, Weiyue Zhao et al.
Multi-Scene Generalized Trajectory Global Graph Solver with Composite Nodes for Multiple Object Tracking
Yan Gao, Haojun Xu, Jie Li et al.
M3SOT: Multi-Frame, Multi-Field, Multi-Space 3D Single Object Tracking
Jiaming Liu, Yue Wu, Maoguo Gong et al.
MDP3: A Training-free Approach for List-wise Frame Selection in Video-LLMs
Hui Sun, Shiyin Lu, Huanyu Wang et al.
Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation
Guy Yariv, Yuval Kirstain, Amit Zohar et al.
Semantic and Sequential Alignment for Referring Video Object Segmentation
Feiyu Pan, Hao Fang, Fangkai Li et al.
Instance Tracking in 3D Scenes from Egocentric Videos
Yunhan Zhao, Haoyu Ma, Shu Kong et al.
XTrack: Multimodal Training Boosts RGB-X Video Object Trackers
Yuedong Tan, Zongwei Wu, Yuqian Fu et al.
ObjectMover: Generative Object Movement with Video Prior
Xin Yu, Tianyu Wang, Soo Ye Kim et al.
Emergent Temporal Correspondences from Video Diffusion Transformers
Jisu Nam, Soowon Son, Dahyun Chung et al.
MUST: The First Dataset and Unified Framework for Multispectral UAV Single Object Tracking
Haolin Qin, Tingfa Xu, Tianhao Li et al.
Focusing on Tracks for Online Multi-Object Tracking
Kyujin Shim, Kangwook Ko, YuJin Yang et al.
SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking
Siyuan Li, Lei Ke, Yung-Hsu Yang et al.
Exploring Historical Information for RGBE Visual Tracking with Mamba
Chuanyu Sun, Jiqing Zhang, Yang Wang et al.
4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface Gaussians
Hidenobu Matsuki, Gwangbin Bae, Andrew J. Davison
OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer
Jinyang Li, En Yu, Sijia Chen et al.
6D Object Pose Tracking in Internet Videos for Robotic Manipulation
Georgy Ponimatkin, Martin CΓfka, Tomas Soucek et al.
Dual Conditioned Motion Diffusion for Pose-Based Video Anomaly Detection
Hongsong Wang, Andi Xu, Pinle Ding et al.
Recognizing Ultra-High-Speed Moving Objects with Bio-Inspired Spike Camera
Junwei Zhao, Shiliang Zhang, Zhaofei Yu et al.
A Unified Framework for Human-centric Point Cloud Video Understanding
Yiteng Xu, Kecheng Ye, xiao han et al.
Omnidirectional Multi-Object Tracking
Kai Luo, Hao Shi, Sheng Wu et al.
Fine-grained Spatiotemporal Grounding on Egocentric Videos
Shuo LIANG, Yiwu Zhong, Zi-Yuan Hu et al.
Projecting Trackable Thermal Patterns for Dynamic Computer Vision
Mark Sheinin, Aswin C. Sankaranarayanan, Srinivasa G. Narasimhan
Precise Event Spotting in Sports Videos: Solving Long-Range Dependency and Class Imbalance
Sanchayan Santra, Vishal Chudasama, Pankaj Wasnik et al.
Flow4Agent: Long-form Video Understanding via Motion Prior from Optical Flow
Ruyang Liu, Shangkun Sun, Haoran Tang et al.
TimeTracker: Event-based Continuous Point Tracking for Video Frame Interpolation with Non-linear Motion
Haoyue Liu, Jinghan Xu, Yi Chang et al.
Walker: Self-supervised Multiple Object Tracking by Walking on Temporal Object Appearance Graphs
Mattia Segu, Luigi Piccinelli, Siyuan Li et al.
PoseTraj: Pose-Aware Trajectory Control in Video Diffusion
longbin ji, Lei Zhong, Pengfei Wei et al.
VETRA: A Dataset for Vehicle Tracking in Aerial Imagery - New Challenges for Multi-Object Tracking
Jens Hellekes, Manuel MΓΌhlhaus, Reza Bahmanyar et al.
TLB-VFI: Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation
Zonglin Lyu, Chen Chen
What You Have is What You Track: Adaptive and Robust Multimodal Tracking
Yuedong Tan, Jiawei Shao, Eduard Zamfir et al.
Tracktention: Leveraging Point Tracking to Attend Videos Faster and Better
Zihang Lai, Andrea Vedaldi
Video Individual Counting for Moving Drones
Yaowu Fan, Jia Wan, Tao Han et al.
HumanMM: Global Human Motion Recovery from Multi-shot Videos
Yuhong Zhang, Guanlin Wu, Ling-Hao Chen et al.
Event2Tracking: Reconstructing Multi-Agent Soccer Trajectories Using Long-Term Multimodal Context
Harry Hughes, Michael Horton, Xinyu Wei et al.
Cross-Modal Stealth: A Coarse-to-Fine Attack Framework for RGB-T Tracker
Xinyu Xiang, Qinglong Yan, Hao Zhang et al.
Exploiting Continuous Motion Clues for Vision-Based Occupancy Prediction
Haoran Xu, Peixi Peng, Xinyi Zhang et al.
TRACE: Learning 3D Gaussian Physical Dynamics from Multi-view Videos
Jinxi Li, Ziyang Song, Bo Yang
Hand-held Object Reconstruction from RGB Video with Dynamic Interaction
Shijian Jiang, Qi Ye, Rengan Xie et al.
BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions
Wonyong Seo, Jihyong Oh, Munchurl Kim
Track Any Anomalous Object:A Granular Video Anomaly Detection Pipeline
Yuzhi Huang, Chenxin Li, Haitao Zhang et al.
Everything is a Video: Unifying Modalities through Next-Frame Prediction
G Thomas Hudson, Dean Slack, Thomas Winterbottom et al.
PvNeXt: Rethinking Network Design and Temporal Motion for Point Cloud Video Recognition
Jie Wang, Tingfa Xu, Lihe Ding et al.
GLOMA: Global Video Text Spotting with Morphological Association
Han Wang, Yanjie Wang, Yang Li et al.
S2-Track: A Simple yet Strong Approach for End-to-End 3D Multi-Object Tracking
Tao Tang, Lijun Zhou, Pengkun Hao et al.
Efficient Motion Prompt Learning for Robust Visual Tracking
Jie Zhao, Xin Chen, Yongsheng Yuan et al.
TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
Jiahao Lu, Weitao Xiong, Jiacheng Deng et al.
Is This Tracker On? A Benchmark Protocol for Dynamic Tracking
Ilona Demler, Saumya Chauhan, Georgia Gkioxari
Motion-Zero: A Zero-Shot Trajectory Control Framework of Moving Object for Diffusion-Based Video Generation
Changgu Chen, Junwei Shu, Gaoqi He et al.
FRAME: Floor-aligned Representation for Avatar Motion from Egocentric Video
Andrea Boscolo Camiletto, Jian Wang, Eduardo Alvarado et al.
Multi-View 3D Point Tracking
Frano RajiΔ, Haofei Xu, Marko Mihajlovic et al.
TAPTR: Tracking Any Point with Transformers as Detection
Hongyang Li, Hao Zhang, Shilong Liu et al.
DreamTrack: Dreaming the Future for Multimodal Visual Object Tracking
Mingzhe Guo, Weiping Tan, Wenyu Ran et al.
Spatial-Temporal Multi-level Association for Video Object Segmentation
Deshui Miao, Xin Li, Zhenyu He et al.
Track4Gen: Teaching Video Diffusion Models to Track Points Improves Video Generation
Hyeonho Jeong, Chun-Hao P. Huang, Jong Chul Ye et al.
Local All-Pair Correspondence for Point Tracking
Seokju Cho, Jiahui Huang, Jisu Nam et al.
VideoOrion: Tokenizing Object Dynamics in Videos
Yicheng Feng, Yijiang Li, Wanpeng Zhang et al.
COVTrack: Continuous Open-Vocabulary Tracking via Adaptive Multi-Cue Fusion
Zekun Qian, Ruize Han, Zhixiang Wang et al.