Zheng Zhu
40 Papers
229 Total Citations
Papers (40)
DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation
CVPR 2025
83
citations
ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration
CVPR 2025
54
citations
DiffBEV: Conditional Diffusion Model for Bird’s Eye View Perception
AAAI 2024, arXiv
36
citations
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Videos Generation
NeurIPS 2025, arXiv
25
citations
ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation
ICCV 2025
22
citations
One at a Time: Progressive Multi-Step Volumetric Probability Learning for Reliable 3D Scene Perception
AAAI 2024, arXiv
9
citations
End-to-End Flow Correlation Tracking With Spatial-Temporal Attention
CVPR 2018, arXiv
0
citations
High Performance Visual Tracking With Siamese Region Proposal Network
CVPR 2018
0
citations
Attention-Guided Unified Network for Panoptic Segmentation
CVPR 2019
0
citations
The Devil Is in the Details: Delving Into Unbiased Data Processing for Human Pose Estimation
CVPR 2020, arXiv
0
citations
Structure-Aware Face Clustering on a Large-Scale Graph With 10^7 Nodes
CVPR 2021
0
citations
WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition
CVPR 2021, arXiv
0
citations
Decoupled Multi-Task Learning With Cyclical Self-Regulation for Face Parsing
CVPR 2022, arXiv
0
citations
CAFE: Learning To Condense Dataset by Aligning Features
CVPR 2022, arXiv
0
citations
Dimension Embeddings for Monocular 3D Object Detection
CVPR 2022
0
citations
Crafting Better Contrastive Views for Siamese Representation Learning
CVPR 2022, arXiv
0
citations
DenseCLIP: Language-Guided Dense Prediction With Context-Aware Prompting
CVPR 2022, arXiv
0
citations
An Efficient Training Approach for Very Large Scale Face Recognition
CVPR 2022, arXiv
0
citations
Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search
CVPR 2022
0
citations
Are We Ready for Vision-Centric Driving Streaming Perception? The ASAP Benchmark
CVPR 2023, arXiv
0
citations
CompletionFormer: Depth Completion With Convolutions and Vision Transformers
CVPR 2023, arXiv
0
citations
DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation
CVPR 2023, arXiv
0
citations
Gait Recognition in the Wild: A Benchmark
ICCV 2021
0
citations
OPERA: Omni-Supervised Representation Learning with Hierarchical Supervisions
ICCV 2023, arXiv
0
citations
OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction
ICCV 2023, arXiv
0
citations
Token-Label Alignment for Vision Transformers
ICCV 2023, arXiv
0
citations
DREAM: Efficient Dataset Distillation by Representative Matching
ICCV 2023, arXiv
0
citations
DyGait: Exploiting Dynamic Representations for High-performance Gait Recognition
ICCV 2023, arXiv
0
citations
OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception
ICCV 2023, arXiv
0
citations
SurroundOcc: Multi-camera 3D Occupancy Prediction for Autonomous Driving
ICCV 2023, arXiv
0
citations
Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis
ECCV 2022
0
citations
MVSTER: Epipolar Transformer for Efficient Multi-View Stereo
ECCV 2022
0
citations
HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation
CVPR 2025
0
citations
JTD-UAV: MLLM-Enhanced Joint Tracking and Description Framework for Anti-UAV Systems
CVPR 2025
0
citations
WonderTurbo: Generating Interactive 3D World in 0.72 Seconds
ICCV 2025
0
citations
DetRF: Detachable Novel Views Synthesis of Dynamic Scenes Using Backdrop-Driven Neural Radiance Fields
AAAI 2025
0
citations
DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation
AAAI 2025
0
citations
DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving
CVPR 2024
0
citations
Global Filter Networks for Image Classification
NeurIPS 2021
0
citations
OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression
NeurIPS 2022
0
citations