"video generation" Papers

19 papers found

CyberHost: A One-stage Diffusion Framework for Audio-driven Talking Body Generation

Gaojie Lin, Jianwen Jiang, Chao Liang et al.

ICLR 2025 poster
17 citations

Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling

Aram Davtyan, Leello Dadi, Volkan Cevher et al.

ICLR 2025 poster
5 citations

GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control

Xuanchi Ren, Tianchang Shen, Jiahui Huang et al.

CVPR 2025 highlight · arXiv:2503.03751
138 citations

Hierarchical Flow Diffusion for Efficient Frame Interpolation

Yang Hai, Guo Wang, Tan Su et al.

CVPR 2025 poster · arXiv:2504.00380
2 citations

Improved Video VAE for Latent Video Diffusion Model

Pingyu Wu, Kai Zhu, Yu Liu et al.

CVPR 2025 poster · arXiv:2411.06449
19 citations

Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing

Jaihoon Kim, Taehoon Yoon, Jisung Hwang et al.

NeurIPS 2025 poster · arXiv:2503.19385
20 citations

MET3R: Measuring Multi-View Consistency in Generated Images

Mohammad Asim, Christopher Wewer, Thomas Wimmer et al.

CVPR 2025 poster · arXiv:2501.06336
43 citations

RLGF: Reinforcement Learning with Geometric Feedback for Autonomous Driving Video Generation

Tianyi Yan, Wencheng Han, Xia Zhou et al.

NeurIPS 2025 poster · arXiv:2509.16500

RoboScape: Physics-informed Embodied World Model

Yu Shang, Xin Zhang, Yinzhou Tang et al.

NeurIPS 2025 oral · arXiv:2506.23135
15 citations

Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists

Bojia Zi, Penghui Ruan, Marco Chen et al.

NeurIPS 2025 poster · arXiv:2502.06734
25 citations

SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation

Yining Hong, Beide Liu, Maxine Wu et al.

ICLR 2025 oral · arXiv:2410.23277
17 citations

VETA-DiT: Variance-Equalized and Temporally Adaptive Quantization for Efficient 4-bit Diffusion Transformers

Qinkai Xu, Yijin Liu, Yang Chen et al.

NeurIPS 2025 oral

VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption

Tianxiong Zhong, Xingye Tian, Boyuan Jiang et al.

NeurIPS 2025 oral · arXiv:2505.12053
3 citations

VORTA: Efficient Video Diffusion via Routing Sparse Attention

Wenhao Sun, Rong-Cheng Tu, Yifu Ding et al.

NeurIPS 2025 poster · arXiv:2505.18809
7 citations

BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering

Xinmin Qiu, Congying Han, Zicheng Zhang et al.

ECCV 2024 poster · arXiv:2403.06243

Boximator: Generating Rich and Controllable Motions for Video Synthesis

Jiawei Wang, Yuchen Zhang, Jiaxin Zou et al.

ICML 2024 poster

Learn the Force We Can: Enabling Sparse Motion Control in Multi-Object Video Generation

Aram Davtyan, Paolo Favaro

AAAI 2024 paper · arXiv:2306.03988

Position: Video as the New Language for Real-World Decision Making

Sherry Yang, Jacob C Walker, Jack Parker-Holder et al.

ICML 2024 poster

RoboDreamer: Learning Compositional World Models for Robot Imagination

Siyuan Zhou, Yilun Du, Jiaben Chen et al.

ICML 2024 poster