Sergey Tulyakov
66 papers · 982 total citations

Papers (66)
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers · CVPR 2024 · 341 citations
4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling · CVPR 2024 · 168 citations
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control · ICLR 2025 · 114 citations
AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers · CVPR 2025 · 78 citations
Wonderland: Navigating 3D Scenes from a Single Image · CVPR 2025 · 54 citations
Multi-subject Open-set Personalization in Video Generation · CVPR 2025 · 40 citations
SceneTex: High-Quality Texture Synthesis for Indoor Scenes via Diffusion Priors · CVPR 2024 · 40 citations
Improving the Diffusability of Autoencoders · ICML 2025 · 34 citations
Scalable Ranked Preference Optimization for Text-to-Image Generation · ICCV 2025 · 21 citations
SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device · CVPR 2025 · 20 citations
Video Motion Transfer with Diffusion Transformers · CVPR 2025 · 18 citations
4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion · CVPR 2025 · 18 citations
MaskControl: Spatio-Temporal Control for Masked Motion Synthesis · ICCV 2025 · 12 citations
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models · NeurIPS 2025 · 11 citations
Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach · NeurIPS 2025 · 8 citations
Efficient Training with Denoised Neural Weights · ECCV 2024 · 5 citations
Flow Guided Transformable Bottleneck Networks for Motion Retargeting · CVPR 2021
Motion Representations for Articulated Animation · CVPR 2021
Playable Video Generation · CVPR 2021
Teachers Do More Than Teach: Compressing Image-to-Image Models · CVPR 2021
Playable Environments: Video Manipulation in Space and Time · CVPR 2022
InOut: Diverse Image Outpainting via GAN Inversion · CVPR 2022
Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning · CVPR 2022
StyleGAN-V: A Continuous Video Generator With the Price, Image Quality and Perks of StyleGAN2 · CVPR 2022
DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-Aware Scene Synthesis · CVPR 2023
Make-a-Story: Visual Memory Conditioned Consistent Story Generation · CVPR 2023
SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation · CVPR 2023
Invertible Neural Skinning · CVPR 2023
Affection: Learning Affective Explanations for Real-World Visual Data · CVPR 2023
Real-Time Neural Light Field on Mobile Devices · CVPR 2023
3DAvatarGAN: Bridging Domains for Personalized Editable Avatars · CVPR 2023
Unsupervised Volumetric Animation · CVPR 2023
ShapeTalk: A Language Dataset and Framework for 3D Shape Edits and Deformations · CVPR 2023
Can Text-to-Video Generation help Video-Language Alignment? · CVPR 2025
Transformable Bottleneck Networks · ICCV 2019
Laplace Landmark Localization · ICCV 2019
Rethinking Vision Transformers for MobileNet Size and Speed · ICCV 2023
Text2Tex: Text-driven Texture Synthesis via Diffusion Models · ICCV 2023
InfiniCity: Infinite-Scale City Synthesis · ICCV 2023
Neural Hair Rendering · ECCV 2020
Cross-Modal 3D Shape Generation and Manipulation · ECCV 2022
R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis · ECCV 2022
Quantized GAN for Complex Music Generation from Dance Videos · ECCV 2022
Regressing a 3D Face Shape From a Single Image · ICCV 2015
Mind the Time: Temporally-Controlled Multi-Event Video Generation · CVPR 2025
SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training · CVPR 2025
Omni-ID: Holistic Identity Representation Designed for Generative Tasks · CVPR 2025
T2Bs: Text-to-Character Blendshapes via Video Generation · ICCV 2025
AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation · ICCV 2025
SPAD: Spatially Aware Multi-View Diffusers · CVPR 2024
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis · CVPR 2024
TextCraftor: Your Text Encoder Can be Image Quality Controller · CVPR 2024
Towards Text-guided 3D Scene Composition · CVPR 2024
Hierarchical Patch Diffusion Models for High-Resolution Video Generation · CVPR 2024
E$^2$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation · ICML 2024
Self-Adaptive Matrix Completion for Heart Rate Estimation From Face Videos Under Realistic Conditions · CVPR 2016
MoCoGAN: Decomposing Motion and Content for Video Generation · CVPR 2018
Animating Arbitrary Objects via Deep Motion Transfer · CVPR 2019
3D Guided Fine-Grained Face Manipulation · CVPR 2019
First Order Motion Model for Image Animation · NeurIPS 2019
EfficientFormer: Vision Transformers at MobileNet Speed · NeurIPS 2022
Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training · NeurIPS 2022
EpiGRAF: Rethinking training of 3D GANs · NeurIPS 2022
SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds · NeurIPS 2023
LightSpeed: Light and Fast Neural Light Fields on Mobile Devices · NeurIPS 2023
Autodecoding Latent 3D Diffusion Models · NeurIPS 2023