"video generation" Papers
19 papers found
CyberHost: A One-stage Diffusion Framework for Audio-driven Talking Body Generation
Gaojie Lin, Jianwen Jiang, Chao Liang et al.
Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling
Aram Davtyan, Leello Dadi, Volkan Cevher et al.
GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control
Xuanchi Ren, Tianchang Shen, Jiahui Huang et al.
Hierarchical Flow Diffusion for Efficient Frame Interpolation
Yang Hai, Guo Wang, Tan Su et al.
Improved Video VAE for Latent Video Diffusion Model
Pingyu Wu, Kai Zhu, Yu Liu et al.
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
Jaihoon Kim, Taehoon Yoon, Jisung Hwang et al.
MET3R: Measuring Multi-View Consistency in Generated Images
Mohammad Asim, Christopher Wewer, Thomas Wimmer et al.
RLGF: Reinforcement Learning with Geometric Feedback for Autonomous Driving Video Generation
Tianyi Yan, Wencheng Han, xia zhou et al.
RoboScape: Physics-informed Embodied World Model
Yu Shang, Xin Zhang, Yinzhou Tang et al.
Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists
Bojia Zi, Penghui Ruan, Marco Chen et al.
SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation
Yining Hong, Beide Liu, Maxine Wu et al.
VETA-DiT: Variance-Equalized and Temporally Adaptive Quantization for Efficient 4-bit Diffusion Transformers
Qinkai XU, yijin liu, YangChen et al.
VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption
Tianxiong Zhong, Xingye Tian, Boyuan Jiang et al.
VORTA: Efficient Video Diffusion via Routing Sparse Attention
Wenhao Sun, Rong-Cheng Tu, Yifu Ding et al.
BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering
Xinmin Qiu, Congying Han, Zicheng Zhang et al.
Boximator: Generating Rich and Controllable Motions for Video Synthesis
Jiawei Wang, Yuchen Zhang, Jiaxin Zou et al.
Learn the Force We Can: Enabling Sparse Motion Control in Multi-Object Video Generation
Aram Davtyan, Paolo Favaro
Position: Video as the New Language for Real-World Decision Making
Sherry Yang, Jacob C Walker, Jack Parker-Holder et al.
RoboDreamer: Learning Compositional World Models for Robot Imagination
Siyuan Zhou, Yilun Du, Jiaben Chen et al.