Sergey Tulyakov

28
Papers
983
Total Citations

Papers (28)

Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

CVPR 2024
341
citations

4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling

CVPR 2024
168
citations

VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control

ICLR 2025
114
citations

AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers

CVPR 2025
78
citations

Wonderland: Navigating 3D Scenes from a Single Image

CVPR 2025
54
citations

Multi-subject Open-set Personalization in Video Generation

CVPR 2025
40
citations

SceneTex: High-Quality Texture Synthesis for Indoor Scenes via Diffusion Priors

CVPR 2024
40
citations

Improving the Diffusability of Autoencoders

ICML 2025
34
citations

Scalable Ranked Preference Optimization for Text-to-Image Generation

ICCV 2025
21
citations

SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device

CVPR 2025
20
citations

4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion

CVPR 2025
18
citations

Video Motion Transfer with Diffusion Transformers

CVPR 2025
18
citations

MaskControl: Spatio-Temporal Control for Masked Motion Synthesis

ICCV 2025
12
citations

DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models

NeurIPS 2025
11
citations

Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach

NeurIPS 2025
8
citations

Efficient Training with Denoised Neural Weights

ECCV 2024
5
citations

Can Text-to-Video Generation help Video-Language Alignment?

CVPR 2025
1
citation

AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation

ICCV 2025
0
citations

SPAD: Spatially Aware Multi-View Diffusers

CVPR 2024
0
citations

Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis

CVPR 2024
0
citations

TextCraftor: Your Text Encoder Can be Image Quality Controller

CVPR 2024
0
citations

T2Bs: Text-to-Character Blendshapes via Video Generation

ICCV 2025
0
citations

Towards Text-guided 3D Scene Composition

CVPR 2024
0
citations

Hierarchical Patch Diffusion Models for High-Resolution Video Generation

CVPR 2024
0
citations

Omni-ID: Holistic Identity Representation Designed for Generative Tasks

CVPR 2025
0
citations

SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training

CVPR 2025
0
citations

Mind the Time: Temporally-Controlled Multi-Event Video Generation

CVPR 2025
0
citations

E$^2$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation

ICML 2024
0
citations