Optical Flow
Estimating motion between frames
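For readers new to the task, here is a minimal sketch of dense optical flow estimation using OpenCV's classical Farnebäck method. The frame file names are placeholders, and this baseline is only illustrative; the learned methods in the papers listed below are far more accurate.

```python
# Minimal dense optical flow sketch using OpenCV's classical Farneback method.
# "frame0.png" / "frame1.png" are placeholder file names for two consecutive frames.
import cv2
import numpy as np

prev_gray = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
next_gray = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# flow[y, x] = (dx, dy): per-pixel displacement from the first frame to the second.
flow = cv2.calcOpticalFlowFarneback(
    prev_gray, next_gray, None,
    pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0,
)

# Standard visualization: flow direction as hue, flow magnitude as brightness.
mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
hsv = np.zeros((*prev_gray.shape, 3), dtype=np.uint8)
hsv[..., 0] = ang * 180 / np.pi / 2                               # hue: direction
hsv[..., 1] = 255                                                 # saturation: full
hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)   # value: magnitude
cv2.imwrite("flow_vis.png", cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR))
```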
Related Topics (Video Analysis)
Top Papers
Mean Flows for One-step Generative Modeling
Zhengyang Geng, Mingyang Deng, Xingjian Bai et al.
SEA-RAFT: Simple, Efficient, Accurate RAFT for Optical Flow
Yihan Wang, Lahav Lipson, Jia Deng
Taming Rectified Flow for Inversion and Editing
Jiangshan Wang, Junfu Pu, Zhongang Qi et al.
Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer
Rafail Fridman, Danah Yatim, Omer Bar-Tal et al.
DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing
Chong Mou, Xintao Wang, Jiechong Song et al.
DepthFM: Fast Generative Monocular Depth Estimation with Flow Matching
Ming Gui, Johannes Schusterbauer, Ulrich Prestel et al.
CCEdit: Creative and Controllable Video Editing via Diffusion Models
Ruoyu Feng, Wenming Weng, Yanhui Wang et al.
Optimizing Diffusion Noise Can Serve As Universal Motion Priors
Korrawe Karunratanakul, Konpat Preechakul, Emre Aksan et al.
DiffMOT: A Real-time Diffusion-based Multiple Object Tracker with Non-linear Prediction
Weiyi Lv, Yuhang Huang, Ning Zhang et al.
Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise
Ryan Burgert, Yuancheng Xu, Wenqi Xian et al.
Seamless Human Motion Composition with Blended Positional Encodings
German Barquero, Sergio Escalera, Cristina Palmero
Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis
Yanzuo Lu, Manlin Zhang, Jinhua Ma et al.
FBRT-YOLO: Faster and Better for Real-Time Aerial Image Detection
Yao Xiao, Tingfa Xu, Yu Xin et al.
Frame Context Packing and Drift Prevention in Next-Frame-Prediction Video Diffusion Models
Lvmin Zhang, Shengqu Cai, Muyang Li et al.
Stable Flow: Vital Layers for Training-Free Image Editing
Omri Avrahami, Or Patashnik, Ohad Fried et al.
MemFlow: Optical Flow Estimation and Prediction with Memory
Qiaole Dong, Yanwei Fu
FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models
Shivangi Aneja, Justus Thies, Angela Dai et al.
FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation
Shuai Yang, Yifan Zhou, Ziwei Liu et al.
Neural Markov Random Field for Stereo Matching
Tongfan Guan, Chen Wang, Yun-Hui Liu
OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers
Han Liang, Jiacheng Bao, Ruichi Zhang et al.
Efficient Multi-scale Network with Learnable Discrete Wavelet Transform for Blind Motion Deblurring
Xin Gao, Tianheng Qiu, Xinyu Zhang et al.
A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames
Pinelopi Papalampidi, Skanda Koppula, Shreya Pathak et al.
SmartRefine: A Scenario-Adaptive Refinement Framework for Efficient Motion Prediction
Yang Zhou, Hao Shao, Letian Wang et al.
Trajectory Attention for Fine-grained Video Motion Control
Zeqi Xiao, Wenqi Ouyang, Yifan Zhou et al.
Frequency Decoupling for Motion Magnification via Multi-Level Isomorphic Architecture
Fei Wang, Dan Guo, Kun Li et al.
FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis
Ke Fan, Junshu Tang, Weijian Cao et al.
ICP-Flow: LiDAR Scene Flow Estimation with ICP
Yancong Lin, Holger Caesar
SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving
Qingwen Zhang, Yi Yang, Peizheng Li et al.
ExtDM: Distribution Extrapolation Diffusion Model for Video Prediction
Zhicheng Zhang, Junyao Hu, Wentao Cheng et al.
FlowIE: Efficient Image Enhancement via Rectified Flow
Yixuan Zhu, Wenliang Zhao, Ao Li et al.
Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion
Otto Seiskari, Jerry Ylilammi, Valtteri Kaatrasalo et al.
Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation
Xiaojuan Wang, Boyang Zhou, Brian Curless et al.
Light3R-SfM: Towards Feed-forward Structure-from-Motion
Sven Elflein, Qunjie Zhou, Laura Leal-Taixé
Sparse Global Matching for Video Frame Interpolation with Large Motion
Chunxu Liu, Guozhen Zhang, Rui Zhao et al.
Boosting Spike Camera Image Reconstruction from a Perspective of Dealing with Spike Fluctuations
Rui Zhao, Ruiqin Xiong, Jing Zhao et al.
Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion
Linzhan Mou, Jun-Kun Chen, Yu-Xiong Wang
Offline and Online Optical Flow Enhancement for Deep Video Compression
Chuanbo Tang, Xihua Sheng, Zhuoyuan Li et al.
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction
Zhen Xing, Qi Dai, Zejia Weng et al.
FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning
Chenhao Li, Elijah Stanger-Jones, Steve Heim et al.
milliFlow: Scene Flow Estimation on mmWave Radar Point Cloud for Human Motion Sensing
Fangqiang Ding, Zhen Luo, Peijun Zhao et al.
Object-Centric Diffusion for Efficient Video Editing
Kumara Kahatapitiya, Adil Karjauv, Davide Abati et al.
Spatio-Temporal Turbulence Mitigation: A Translational Perspective
Xingguang Zhang, Nicholas M Chimitt, Yiheng Chi et al.
MotionFollower: Editing Video Motion via Score-Guided Diffusion
Shuyuan Tu, Qi Dai, Zihao Zhang et al.
GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion
Jiapeng Tang, Davide Davoli, Tobias Kirschstein et al.
MagicMirror: ID-Preserved Video Generation in Video Diffusion Transformers
Yuechen Zhang, Yaoyang Liu, Bin Xia et al.
IMPUS: Image Morphing with Perceptually-Uniform Sampling Using Diffusion Models
Zhaoyuan Yang, Zhengyang Yu, Zhiwei Xu et al.
FlowTrack: Revisiting Optical Flow for Long-Range Dense Tracking
Seokju Cho, Gabriel Huang, Seungryong Kim et al.
TeTriRF: Temporal Tri-Plane Radiance Fields for Efficient Free-Viewpoint Video
Minye Wu, Zehao Wang, Georgios Kouros et al.
CVT-Occ: Cost Volume Temporal Fusion for 3D Occupancy Prediction
Zhangchen Ye, Tao Jiang, Chenfeng Xu et al.
Spectral Motion Alignment for Video Motion Transfer Using Diffusion Models
Geon Yeong Park, Hyeonho Jeong, Sang Wan Lee et al.
LayoutFlow: Flow Matching for Layout Generation
Julian Jorge Andrade Guerreiro, Naoto Inoue, Kento Masui et al.
GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
Tian-Xing Xu, Xiangjun Gao, Wenbo Hu et al.
MoST: Motion Style Transformer Between Diverse Action Contents
Boeun Kim, Jungho Kim, Hyung Jin Chang et al.
Video Motion Transfer with Diffusion Transformers
Alexander Pondaven, Aliaksandr Siarohin, Sergey Tulyakov et al.
Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring
Huicong Zhang, Haozhe Xie, Hongxun Yao
4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion
Chaoyang Wang, Peiye Zhuang, Tuan Duc Ngo et al.
Understanding Optimization in Deep Learning with Central Flows
Jeremy Cohen, Alex Damian, Ameet Talwalkar et al.
HaWoR: World-Space Hand Motion Reconstruction from Egocentric Videos
Jinglei Zhang, Jiankang Deng, Chao Ma et al.
Motion-prior Contrast Maximization for Dense Continuous-Time Motion Estimation
Friedhelm Hamann, Ziyun Wang, Ioannis Asmanis et al.
Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation
Zhuoman Liu, Weicai Ye, Yan Luximon et al.
Programmable Motion Generation for Open-Set Motion Control Tasks
Hanchao Liu, Xiaohang Zhan, Shaoli Huang et al.
Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking
Jiawen Zhu, Huayi Tang, Xin Chen et al.
OmniMotionGPT: Animal Motion Generation with Limited Data
Zhangsihao Yang, Mingyuan Zhou, Mengyi Shan et al.
VMBench: A Benchmark for Perception-Aligned Video Motion Generation
Xinran Ling, Chen Zhu, Meiqi Wu et al.
GaussianFlowOcc: Sparse and Weakly Supervised Occupancy Estimation using Gaussian Splatting and Temporal Flow
Simon Boeder, Fabian Gigengack, Benjamin Risse
Video Anomaly Detection with Motion and Appearance Guided Patch Diffusion Model
Hang Zhou, Jiale Cai, Yuteng Ye et al.
Complementing Event Streams and RGB Frames for Hand Mesh Reconstruction
Jianping Jiang, Xinyu Zhou, Bingxuan Wang et al.
Video Diffusion Models Are Strong Video Inpainter
Minhyeok Lee, Suhwan Cho, Chajin Shin et al.
MoVideo: Motion-Aware Video Generation with Diffusion Models
Jingyun Liang, Yuchen Fan, Kai Zhang et al.
Temporal Event Stereo via Joint Learning with Stereoscopic Flow
Hoonhee Cho, Jae-young Kang, Kuk-Jin Yoon
Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation
Yingjie Chen, Yifang Men, Yuan Yao et al.
ECVC: Exploiting Non-Local Correlations in Multiple Frames for Contextual Video Compression
Wei Jiang, Junru Li, Kai Zhang et al.
SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation
Jiaben Chen, Huaizu Jiang
3D Multi-frame Fusion for Video Stabilization
Zhan Peng, Xinyi Ye, Weiyue Zhao et al.
MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation
Shuwei Shi, Biao Gong, Xi Chen et al.
MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow
Hanzhuo Huang, Yuan Liu, Ge Zheng et al.
UFM: A Simple Path towards Unified Dense Correspondence with Flow
Yuchen Zhang, Nikhil Keetha, Chenwei Lyu et al.
MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm
Ziyan Guo, Zeyu Hu, Na Zhao et al.
Long-term Temporal Context Gathering for Neural Video Compression
Linfeng Qi, Zhaoyang Jia, Jiahao Li et al.
MoDec-GS: Global-to-Local Motion Decomposition and Temporal Interval Adjustment for Compact Dynamic 3D Gaussian Splatting
Sangwoon Kwak, Joonsoo Kim, Jun Young Jeong et al.
SALAD: Skeleton-aware Latent Diffusion for Text-driven Motion Generation and Editing
Seokhyeon Hong, Chaelin Kim, Serin Yoon et al.
ZeroFlow: Scalable Scene Flow via Distillation
Kyle Vedder, Neehar Peri, Nathaniel Chodosh et al.
Zero-Shot Monocular Scene Flow Estimation in the Wild
Yiqing Liang, Abhishek Badki, Hang Su et al.
MaskControl: Spatio-Temporal Control for Masked Motion Synthesis
Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Korrawe Karunratanakul et al.
FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors
Yabo Zhang, Xinpeng Zhou, Yihan Zeng et al.
FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing
Tianyi Wei, Yifan Zhou, Dongdong Chen et al.
CMTA: Cross-Modal Temporal Alignment for Event-guided Video Deblurring
Taewoo Kim, Hoonhee Cho, Kuk-Jin Yoon
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction
Peng Liu, Dongyang Dai, Zhiyong Wu
4DGC: Rate-Aware 4D Gaussian Compression for Efficient Streamable Free-Viewpoint Video
Qiang Hu, Zihan Zheng, Houqiang Zhong et al.
FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video
Yue Gao, Hong-Xing Yu, Bo Zhu et al.
LBM: Latent Bridge Matching for Fast Image-to-Image Translation
Clément Chadebec, Onur Tasar, Sanjeev Sreetharan et al.
Neural Sampling from Boltzmann Densities: Fisher-Rao Curves in the Wasserstein Geometry
Jannis Chemseddine, Christian Wald, Richard Duong et al.
Data-Efficient Unsupervised Interpolation Without Any Intermediate Frame for 4D Medical Images
JungEun Kim, Hangyul Yoon, Geondo Park et al.
Continuous Piecewise-Affine Based Motion Model for Image Animation
Hexiang Wang, Fengqi Liu, Qianyu Zhou et al.
A Theory of Joint Light and Heat Transport for Lambertian Scenes
Mani Ramanagopal, Sriram Narayanan, Aswin C. Sankaranarayanan et al.
Multi-View Dynamic Reflection Prior for Video Glass Surface Detection
Fang Liu, Yuhao Liu, Jiaying Lin et al.
Motion and Structure from Event-based Normal Flow
Zhongyang Ren, Bangyan Liao, Delei Kong et al.
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think
Zhenyi Lu, Xiaoye Qu et al.
Efficient Gaussian Splatting for Monocular Dynamic Scene Rendering via Sparse Time-Variant Attribute Modeling
Hanyang Kong, Xingyi Yang, Xinchao Wang
RoMo: Robust Motion Segmentation Improves Structure from Motion
Lily Goli, Sara Sabour, Mark Matthews et al.