Hengshuang Zhao
31 Papers
527 Total Citations
Papers (31)
Pixel-GS: Density Control with Pixel-aware Gradient for 3D Gaussian Splatting
ECCV 2024
96
citations
Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training
CVPR 2024
77
citations
UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics
CVPR 2025
70
citations
GPT4Point: A Unified Framework for Point-Language Understanding and Generation
CVPR 2024
62
citations
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
CVPR 2025
44
citations
Sonata: Self-Supervised Learning of Reliable Point Representations
CVPR 2025
39
citations
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models
ICML 2025
28
citations
GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding
CVPR 2024
25
citations
DreamComposer: Controllable 3D Object Generation via Multi-View Conditions
CVPR 2024
19
citations
DriveGPT4-V2: Harnessing Large Language Model Capabilities for Enhanced Closed-Loop Autonomous Driving
CVPR 2025
17
citations
ViLLa: Video Reasoning Segmentation with Large Language Model
ICCV 2025
16
citations
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models
CVPR 2025
13
citations
HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding
ICML 2025
6
citations
ROSE: Remove Objects with Side Effects in Videos
NeurIPS 2025
4
citations
Empowering Large Language Models with 3D Situation Awareness
CVPR 2025
3
citations
PlayerOne: Egocentric World Simulator
NeurIPS 2025
3
citations
LiteReality: Graphic-Ready 3D Scene Reconstruction from RGB-D Scans
NeurIPS 2025
3
citations
BOOD: Boundary-based Out-Of-Distribution Data Generation
ICML 2025
2
citations
UniMODE: Unified Monocular 3D Object Detection
CVPR 2024
0
citations
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
CVPR 2024
0
citations
Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding
CVPR 2024
0
citations
DiffDoctor: Diagnosing Image Diffusion Models Before Treating
ICCV 2025
0
citations
StableDepth: Scene-Consistent and Scale-Invariant Monocular Depth
ICCV 2025
0
citations
DisCo: Towards Distinct and Coherent Visual Encapsulation in Video MLLMs
ICCV 2025
0
citations
HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation
ICCV 2025
0
citations
SpatialCLIP: Learning 3D-aware Image Representations from Spatially Discriminative Language
CVPR 2025
0
citations
AnyDoor: Zero-shot Object-level Image Customization
CVPR 2024
0
citations
PanDA: Towards Panoramic Depth Anything with Unlabeled Panoramas and Mobius Spatial Augmentation
CVPR 2025
0
citations
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving
CVPR 2024
0
citations
OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation
CVPR 2024
0
citations
Point Transformer V3: Simpler, Faster, Stronger
CVPR 2024
0
citations