Ziwei Liu
154
Papers
3,483
Total Citations
10
h-index
Papers (154)
VBench: Comprehensive Benchmark Suite for Video Generative Models
CVPR 2024
996
citations
LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation
ECCV 2024
616
citations
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation
ICLR 2024
408
citations
Knowledge Distillation Meets Self-Supervision
ECCV 2020
319
citations
SinSR: Diffusion-Based Image Super-Resolution in a Single Step
CVPR 2024
214
citations
SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
ICLR 2024
209
citations
VideoBooth: Diffusion-based Video Generation with Image Prompts
CVPR 2024
118
citations
Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation
ECCV 2020
95
citations
InstructVideo: Instructing Video Diffusion Models with Human Feedback
CVPR 2024
80
citations
FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation
CVPR 2024
49
citations
Digital Life Project: Autonomous 3D Characters with Social Intelligence
CVPR 2024
46
citations
Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment
ICLR 2024
45
citations
AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation
CVPR 2024
39
citations
Generative Gaussian Splatting for Unbounded 3D City Generation
CVPR 2025
32
citations
Towards Language-Driven Video Inpainting via Multimodal Large Language Models
CVPR 2024
30
citations
Multi-Space Alignments Towards Universal LiDAR Segmentation
CVPR 2024
30
citations
VistaDream: Sampling multiview consistent images for single-view scene reconstruction
ICCV 2025
27
citations
Material Anything: Generating Materials for Any 3D Object via Diffusion
CVPR 2025
22
citations
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D
CVPR 2025
19
citations
AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation
ICLR 2025
18
citations
Move Anything with Layered Scene Diffusion
CVPR 2024
13
citations
EgoLM: Multi-Modal Language Model of Egocentric Motions
CVPR 2025
12
citations
SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters
CVPR 2025
9
citations
GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography
ICCV 2025
7
citations
ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models
NeurIPS 2025
7
citations
Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion
CVPR 2025
7
citations
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model
ICCV 2025
5
citations
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
ICCV 2025
5
citations
GeneMAN: Generalizable Single-Image 3D Human Reconstruction from Multi-Source Human Data
NeurIPS 2025
3
citations
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
CVPR 2025
3
citations
Self-Supervised Scene De-Occlusion
CVPR 2020
0
citations
When NAS Meets Robustness: In Search of Robust Architectures Against Adversarial Attacks
CVPR 2020
0
citations
Online Deep Clustering for Unsupervised Representation Learning
CVPR 2020
0
citations
Rotate-and-Render: Unsupervised Photorealistic Face Rotation From Single-View Images
CVPR 2020
0
citations
MaskGAN: Towards Diverse and Interactive Facial Image Manipulation
CVPR 2020
0
citations
Open Compound Domain Adaptation
CVPR 2020
0
citations
Visually Informed Binaural Audio Generation without Binaural Audios
CVPR 2021
0
citations
Adversarial Robustness Under Long-Tailed Distribution
CVPR 2021
0
citations
Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination
CVPR 2021
0
citations
LiDAR-Based Panoptic Segmentation via Dynamic Shifting Network
CVPR 2021
0
citations
Seesaw Loss for Long-Tailed Instance Segmentation
CVPR 2021
0
citations
Variational Relational Point Completion Network
CVPR 2021
0
citations
ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis
CVPR 2021
0
citations
Deep Animation Video Interpolation in the Wild
CVPR 2021
0
citations
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation
CVPR 2021
0
citations
Robust Reference-Based Super-Resolution via C2-Matching
CVPR 2021
0
citations
Delving Deep Into the Generalization of Vision Transformers Under Distribution Shifts
CVPR 2022
0
citations
Versatile Multi-Modal Pre-Training for Human-Centric Perception
CVPR 2022
0
citations
Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer
CVPR 2022
0
citations
TCTrack: Temporal Contexts for Aerial Tracking
CVPR 2022
0
citations
Balanced MSE for Imbalanced Visual Regression
CVPR 2022
0
citations
Bailando: 3D Dance Generation by Actor-Critic GPT With Choreographic Memory
CVPR 2022
0
citations
Conditional Prompt Learning for Vision-Language Models
CVPR 2022
0
citations
Full-Range Virtual Try-On With Recurrent Tri-Level Transform
CVPR 2022
0
citations
Unsupervised Image-to-Image Translation With Generative Prior
CVPR 2022
0
citations
F2-NeRF: Fast Neural Radiance Field Training With Free Camera Trajectories
CVPR 2023
0
citations
StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-Based Generator
CVPR 2023
0
citations
LaserMix for Semi-Supervised LiDAR Semantic Segmentation
CVPR 2023
0
citations
Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
CVPR 2023
0
citations
OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation
CVPR 2023
0
citations
Panoptic Video Scene Graph Generation
CVPR 2023
0
citations
Detecting and Grounding Multi-Modal Media Manipulation
CVPR 2023
0
citations
Collaborative Diffusion for Multi-Modal Face Generation and Editing
CVPR 2023
0
citations
Semantic Image Segmentation via Deep Parsing Network
ICCV 2015
0
citations
Deep Learning Face Attributes in the Wild
ICCV 2015
0
citations
Video Frame Synthesis Using Deep Voxel Flow
ICCV 2017
0
citations
Vision-Infused Deep Audio Inpainting
ICCV 2019
0
citations
CARAFE: Content-Aware ReAssembly of FEatures
ICCV 2019
0
citations
Delving Deep Into Hybrid Annotations for 3D Human Recovery in the Wild
ICCV 2019
0
citations
Unsupervised Domain Adaptive 3D Detection With Multi-Level Consistency
ICCV 2021
0
citations
Differentiable Dynamic Wirings for Neural Networks
ICCV 2021
0
citations
Talk-To-Edit: Fine-Grained Facial Editing via Dialog
ICCV 2021
0
citations
Incorporating Convolution Designs Into Visual Transformers
ICCV 2021
0
citations
Semantically Coherent Out-of-Distribution Detection
ICCV 2021
0
citations
BlockPlanner: City Block Generation With Vectorized Graph Representation
ICCV 2021
0
citations
Energy-Based Open-World Uncertainty Modeling for Confidence Calibration
ICCV 2021
0
citations
Deep Geometrized Cartoon Line Inbetweening
ICCV 2023
0
citations
Cloth2Body: Generating 3D Human Body Mesh from 2D Clothing
ICCV 2023
0
citations
SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling
ICCV 2023
0
citations
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
ICCV 2023
0
citations
DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-Centric Rendering
ICCV 2023
0
citations
SparseNeRF: Distilling Depth Ranking for Few-shot Novel View Synthesis
ICCV 2023
0
citations
DeformToon3D: Deformable Neural Radiance Fields for 3D Toonification
ICCV 2023
0
citations
UnitedHuman: Harnessing Multi-Source Data for High-Resolution Human Generation
ICCV 2023
0
citations
StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces
ICCV 2023
0
citations
ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model
ICCV 2023
0
citations
Rethinking Range View Representation for LiDAR Segmentation
ICCV 2023
0
citations
HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation
CVPR 2025
0
citations
SHERF: Generalizable Human NeRF from a Single Image
ICCV 2023
0
citations
Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets
ECCV 2020
0
citations
CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations
ECCV 2020
0
citations
Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement
ECCV 2020
0
citations
Placepedia: Comprehensive Place Understanding with Multi-Faceted Annotations
ECCV 2020
0
citations
UNIF: United Neural Implicit Functions for Clothed Human Reconstruction and Animation
ECCV 2022
0
citations
HuMMan: Multi-modal 4D Human Dataset for Versatile Sensing and Modeling
ECCV 2022
0
citations
Benchmarking Omni-Vision Representation through the Lens of Visual Realms
ECCV 2022
0
citations
CelebV-HQ: A Large-Scale Video Facial Attributes Dataset
ECCV 2022
0
citations
Detecting and Recovering Sequential DeepFake Manipulation
ECCV 2022
0
citations
Relighting4D: Neural Relightable Human from Videos
ECCV 2022
0
citations
StyleSwap: Style-Based Generator Empowers Robust Face Swapping
ECCV 2022
0
citations
Fast-Vid2Vid: Spatial-Temporal Compression for Video-to-Video Synthesis
ECCV 2022
0
citations
StyleLight: HDR Panorama Generation for Lighting Estimation and Editing
ECCV 2022
0
citations
StyleGAN-Human: A Data-Centric Odyssey of Human Generation
ECCV 2022
0
citations
X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation
ECCV 2022
0
citations
Panoptic Scene Graph Generation
ECCV 2022
0
citations
Mind the Gap in Distilling StyleGANs
ECCV 2022
0
citations
Text2Performer: Text-Driven Human Video Generation
ICCV 2023
0
citations
LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes
CVPR 2025
0
citations
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion
CVPR 2025
0
citations
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
CVPR 2025
0
citations
Disco4D: Disentangled 4D Human Generation and Animation from a Single Image
CVPR 2025
0
citations
EgoLife: Towards Egocentric Life Assistant
CVPR 2025
0
citations
WildAvatar: Learning In-the-wild 3D Avatars from the Web
CVPR 2025
0
citations
GauUpdate: New Object Insertion in 3D Gaussian Fields with Consistent Global Illumination
ICCV 2025
0
citations
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models
ICCV 2025
0
citations
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data and Metric Perspectives
ICCV 2025
0
citations
Dual-Expert Consistency Model for Efficient and High-Quality Video Generation
ICCV 2025
0
citations
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion
ICCV 2025
0
citations
Towards Video Thinking Test: A Holistic Benchmark for Advanced Video Reasoning and Understanding
ICCV 2025
0
citations
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency
ICCV 2025
0
citations
DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior
ICCV 2025
0
citations
SIGMA: Selective Gated Mamba for Sequential Recommendation
AAAI 2025
0
citations
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation
CVPR 2024
0
citations
URHand: Universal Relightable Hands
CVPR 2024
0
citations
GauHuman: Articulated Gaussian Splatting from Monocular Human Videos
CVPR 2024
0
citations
SurMo: Surface-based 4D Motion Modeling for Dynamic Human Rendering
CVPR 2024
0
citations
CityDreamer: Compositional Generative Model of Unbounded 3D Cities
CVPR 2024
0
citations
Vlogger: Make Your Dream A Vlog
CVPR 2024
0
citations
FreeU: Free Lunch in Diffusion U-Net
CVPR 2024
0
citations
Link-Context Learning for Multimodal LLMs
CVPR 2024
0
citations
HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting
CVPR 2024
0
citations
DeepFashion: Powering Robust Clothes Recognition and Retrieval With Rich Annotations
CVPR 2016
0
citations
Not All Pixels Are Equal: Difficulty-Aware Semantic Segmentation via Deep Layer Cascade
CVPR 2017
0
citations
Self-Supervised Learning via Conditional Motion Propagation
CVPR 2019
0
citations
Large-Scale Long-Tailed Recognition in an Open World
CVPR 2019
0
citations
Hybrid Task Cascade for Instance Segmentation
CVPR 2019
0
citations
Few-Shot Object Detection via Association and DIscrimination
NeurIPS 2021
0
citations
Garment4D: Garment Reconstruction from Point Cloud Sequences
NeurIPS 2021
0
citations
Unsupervised Object-Level Representation Learning from Scene Images
NeurIPS 2021
0
citations
Balanced Chamfer Distance as a Comprehensive Metric for Point Cloud Completion
NeurIPS 2021
0
citations
AnimeRun: 2D Animation Visual Correspondence from Open Source 3D Movies
NeurIPS 2022
0
citations
Audio-Driven Co-Speech Gesture Video Generation
NeurIPS 2022
0
citations
Benchmarking and Analyzing 3D Human Pose and Shape Estimation Beyond Algorithms
NeurIPS 2022
0
citations
OpenOOD: Benchmarking Generalized Out-of-Distribution Detection
NeurIPS 2022
0
citations
RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars
NeurIPS 2023
0
citations
SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation
NeurIPS 2023
0
citations
PrimDiffusion: Volumetric Primitives Diffusion for 3D Human Generation
NeurIPS 2023
0
citations
FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing
NeurIPS 2023
0
citations
Towards Robust and Expressive Whole-body Human Pose and Shape Estimation
NeurIPS 2023
0
citations
What Makes Good Examples for Visual In-Context Learning?
NeurIPS 2023
0
citations
Segment Any Point Cloud Sequences by Distilling Vision Foundation Models
NeurIPS 2023
0
citations
InsActor: Instruction-driven Physics-based Characters
NeurIPS 2023
0
citations
4D Panoptic Scene Graph Generation
NeurIPS 2023
0
citations
Large Language Models are Visual Reasoning Coordinators
NeurIPS 2023
0
citations