Hengshuang Zhao
65
Papers
527
Total Citations
Papers (65)
Pixel-GS: Density Control with Pixel-aware Gradient for 3D Gaussian Splatting
ECCV 2024
96
citations
Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training
CVPR 2024
77
citations
UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics
CVPR 2025
70
citations
GPT4Point: A Unified Framework for Point-Language Understanding and Generation
CVPR 2024
62
citations
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
CVPR 2025
44
citations
Sonata: Self-Supervised Learning of Reliable Point Representations
CVPR 2025
39
citations
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models
ICML 2025
28
citations
GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding
CVPR 2024
25
citations
DreamComposer: Controllable 3D Object Generation via Multi-View Conditions
CVPR 2024
19
citations
DriveGPT4-V2: Harnessing Large Language Model Capabilities for Enhanced Closed-Loop Autonomous Driving
CVPR 2025
17
citations
ViLLa: Video Reasoning Segmentation with Large Language Model
ICCV 2025
16
citations
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models
CVPR 2025
13
citations
HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding
ICML 2025
6
citations
ROSE: Remove Objects with Side Effects in Videos
NeurIPS 2025
4
citations
LiteReality: Graphic-Ready 3D Scene Reconstruction from RGB-D Scans
NeurIPS 2025
3
citations
Empowering Large Language Models with 3D Situation Awareness
CVPR 2025
3
citations
PlayerOne: Egocentric World Simulator
NeurIPS 2025
3
citations
BOOD: Boundary-based Out-Of-Distribution Data Generation
ICML 2025
2
citations
Exploring Self-Attention for Image Recognition
CVPR 2020
0
citations
Distilling Knowledge via Knowledge Review
CVPR 2021
0
citations
Semi-Supervised Semantic Segmentation With Directional Context-Aware Consistency
CVPR 2021
0
citations
PAConv: Position Adaptive Convolution With Dynamic Kernel Assembling on Point Clouds
CVPR 2021
0
citations
Fully Convolutional Networks for Panoptic Segmentation
CVPR 2021
0
citations
Bidirectional Projection Network for Cross Dimension Scene Understanding
CVPR 2021
0
citations
Rethinking Semantic Segmentation From a Sequence-to-Sequence Perspective With Transformers
CVPR 2021
0
citations
FocalClick: Towards Practical Interactive Image Segmentation
CVPR 2022
0
citations
Generalized Few-Shot Semantic Segmentation
CVPR 2022
0
citations
PhysFormer: Facial Video-Based Physiological Measurement With Temporal Difference Transformer
CVPR 2022
0
citations
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation
CVPR 2022
0
citations
Stratified Transformer for 3D Point Cloud Segmentation
CVPR 2022
0
citations
Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning
CVPR 2023
0
citations
Detecting Everything in the Open World: Towards Universal Object Detection
CVPR 2023
0
citations
Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation
ICCV 2019
0
citations
Dynamic Divide-and-Conquer Adversarial Training for Robust Semantic Segmentation
ICCV 2021
0
citations
Point Transformer
ICCV 2021
0
citations
Open-vocabulary Panoptic Segmentation with Embedding Modulation
ICCV 2023
0
citations
Shrinking Class Space for Enhanced Certainty in Semi-Supervised Learning
ICCV 2023
0
citations
BT^2: Backward-compatible Training with Basis Transformation
ICCV 2023
0
citations
MTFormer: Multi-task Learning via Transformer and Cross-Task Reasoning
ECCV 2022
0
citations
SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness
ECCV 2022
0
citations
DecoupleNet: Decoupled Network for Domain Adaptive Semantic Segmentation
ECCV 2022
0
citations
Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners
CVPR 2023
0
citations
SpatialCLIP: Learning 3D-aware Image Representations from Spatially Discriminative Language
CVPR 2025
0
citations
PanDA: Towards Panoramic Depth Anything with Unlabeled Panoramas and Mobius Spatial Augmentation
CVPR 2025
0
citations
DiffDoctor: Diagnosing Image Diffusion Models Before Treating
ICCV 2025
0
citations
StableDepth: Scene-Consistent and Scale-Invariant Monocular Depth
ICCV 2025
0
citations
DisCo: Towards Distinct and Coherent Visual Encapsulation in Video MLLMs
ICCV 2025
0
citations
HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation
ICCV 2025
0
citations
AnyDoor: Zero-shot Object-level Image Customization
CVPR 2024
0
citations
Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding
CVPR 2024
0
citations
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving
CVPR 2024
0
citations
OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation
CVPR 2024
0
citations
Point Transformer V3: Simpler Faster Stronger
CVPR 2024
0
citations
UniMODE: Unified Monocular 3D Object Detection
CVPR 2024
0
citations
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
CVPR 2024
0
citations
Pyramid Scene Parsing Network
CVPR 2017
0
citations
PointWeb: Enhancing Local Neighborhood Features for Point Cloud Processing
CVPR 2019
0
citations
UPSNet: A Unified Panoptic Segmentation Network
CVPR 2019
0
citations
PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation
CVPR 2020
0
citations
Do Different Tracking Tasks Require Different Appearance Models?
NeurIPS 2021
0
citations
Point Transformer V2: Grouped Vector Attention and Partition-based Pooling
NeurIPS 2022
0
citations
FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models
NeurIPS 2023
0
citations
TMT-VIS: Taxonomy-aware Multi-dataset Joint Training for Video Instance Segmentation
NeurIPS 2023
0
citations
Uni3DETR: Unified 3D Detection Transformer
NeurIPS 2023
0
citations
CorresNeRF: Image Correspondence Priors for Neural Radiance Fields
NeurIPS 2023
0
citations