Sergey Tulyakov
28 Papers
983 Total Citations
Papers (28)
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
CVPR 2024
341
citations
4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling
CVPR 2024
168
citations
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
ICLR 2025
114
citations
AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers
CVPR 2025
78
citations
Wonderland: Navigating 3D Scenes from a Single Image
CVPR 2025
54
citations
Multi-subject Open-set Personalization in Video Generation
CVPR 2025
40
citations
SceneTex: High-Quality Texture Synthesis for Indoor Scenes via Diffusion Priors
CVPR 2024
40
citations
Improving the Diffusability of Autoencoders
ICML 2025
34
citations
Scalable Ranked Preference Optimization for Text-to-Image Generation
ICCV 2025
21
citations
SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device
CVPR 2025
20
citations
4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion
CVPR 2025
18
citations
Video Motion Transfer with Diffusion Transformers
CVPR 2025
18
citations
MaskControl: Spatio-Temporal Control for Masked Motion Synthesis
ICCV 2025
12
citations
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
NeurIPS 2025
11
citations
Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach
NeurIPS 2025
8
citations
Efficient Training with Denoised Neural Weights
ECCV 2024
5
citations
Can Text-to-Video Generation help Video-Language Alignment?
CVPR 2025
1
citations
AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation
ICCV 2025
0
citations
SPAD: Spatially Aware Multi-View Diffusers
CVPR 2024
0
citations
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
CVPR 2024
0
citations
TextCraftor: Your Text Encoder Can be Image Quality Controller
CVPR 2024
0
citations
T2Bs: Text-to-Character Blendshapes via Video Generation
ICCV 2025
0
citations
Towards Text-guided 3D Scene Composition
CVPR 2024
0
citations
Hierarchical Patch Diffusion Models for High-Resolution Video Generation
CVPR 2024
0
citations
Omni-ID: Holistic Identity Representation Designed for Generative Tasks
CVPR 2025
0
citations
SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training
CVPR 2025
0
citations
Mind the Time: Temporally-Controlled Multi-Event Video Generation
CVPR 2025
0
citations
E$^2$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation
ICML 2024
0
citations