Hao Li
125
Papers
654
Total Citations
Papers (125)
Training Quantized Nets: A Deeper Understanding
NeurIPS 2017 (arXiv)
222
citations
The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World
ICLR 2024
118
citations
GoT: Unleashing Reasoning Capability of MLLM for Visual Generation and Editing
NeurIPS 2025
60
citations
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding
CVPR 2025
34
citations
Token-Level Contrastive Learning with Modality-Aware Prompting for Multimodal Intent Recognition
AAAI 2024 (arXiv)
33
citations
Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction
CVPR 2024
32
citations
VOODOO 3D: Volumetric Portrait Disentanglement For One-Shot 3D Head Reenactment
CVPR 2024
29
citations
GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding
CVPR 2024
28
citations
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation
ICCV 2025
17
citations
Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations
NeurIPS 2025
17
citations
RoboGround: Robotic Manipulation with Grounded Vision-Language Priors
CVPR 2025 (arXiv)
15
citations
Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
ECCV 2024
13
citations
GRPose: Learning Graph Relations for Human Image Generation with Pose Priors
AAAI 2025
10
citations
GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation
CVPR 2025
5
citations
GIFStream: 4D Gaussian-based Immersive Video with Feature Stream
CVPR 2025
4
citations
VEGAS: Towards Visually Explainable and Grounded Artificial Social Intelligence
AAAI 2025
4
citations
Pioneer: Physics-informed Riemannian Graph ODE for Entropy-increasing Dynamics
AAAI 2025
4
citations
Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding
CVPR 2025
3
citations
Political Actor Agent: Simulating Legislative System for Roll Call Votes Prediction with Large Language Models
AAAI 2025
2
citations
Wav2Sem: Plug-and-Play Audio Semantic Decoupling for 3D Speech-Driven Facial Animation
CVPR 2025
2
citations
TMetaNet: Topological Meta-Learning Framework for Dynamic Link Prediction
ICML 2025
1
citation
STRIDER: Navigation via Instruction-Aligned Structural Decision Space Optimization
NeurIPS 2025
1
citation
NeRFCodec: Neural Feature Compression Meets Neural Radiance Fields for Memory-Efficient Scene Representation
CVPR 2024
0
citations
LTGC: Long-tail Recognition via Leveraging LLMs-driven Generated Content
CVPR 2024
0
citations
Diffusion-based Blind Text Image Super-Resolution
CVPR 2024
0
citations
RoboMP$^2$: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language Models
ICML 2024
0
citations
PointMC: Multi-instance Point Cloud Registration based on Maximal Cliques
ICML 2024
0
citations
Unconstrained Realtime Facial Performance Capture
CVPR 2015
0
citations
Dense Human Body Correspondences Using Convolutional Networks
CVPR 2016
0
citations
Photorealistic Facial Texture Inference Using Deep Neural Networks
CVPR 2017 (arXiv)
0
citations
High-Resolution Image Inpainting Using Multi-Scale Neural Patch Synthesis
CVPR 2017 (arXiv)
0
citations
DoubleFusion: Real-Time Capture of Human Performances With Inner Body Shapes From a Single Depth Sensor
CVPR 2018 (arXiv)
0
citations
Mesoscopic Facial Geometry Inference Using Deep Neural Networks
CVPR 2018
0
citations
Large-Scale Distance Metric Learning With Uncertainty
CVPR 2018 (arXiv)
0
citations
SiCloPe: Silhouette-Based Clothed People
CVPR 2019
0
citations
On the Continuity of Rotation Representations in Neural Networks
CVPR 2019
0
citations
ARCH: Animatable Reconstruction of Clothed Humans
CVPR 2020 (arXiv)
0
citations
Learning Formation of Physically-Based Face Attributes
CVPR 2020 (arXiv)
0
citations
Hierarchically Robust Representation Learning
CVPR 2020 (arXiv)
0
citations
Intuitive, Interactive Beard and Hair Synthesis With Generative Models
CVPR 2020 (arXiv)
0
citations
DR Loss: Improving Object Detection by Distributional Ranking
CVPR 2020 (arXiv)
0
citations
Robust Representation Learning With Feedback for Single Image Deraining
CVPR 2021 (arXiv)
0
citations
Equivariant Point Network for 3D Point Cloud Analysis
CVPR 2021 (arXiv)
0
citations
Normalized Avatar Synthesis Using StyleGAN and Perceptual Refinement
CVPR 2021 (arXiv)
0
citations
Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework
CVPR 2021
0
citations
SKFAC: Training Neural Networks With Faster Kronecker-Factored Approximate Curvature
CVPR 2021
0
citations
Uni-Perceiver: Pre-Training Unified Architecture for Generic Perception for Zero-Shot and Few-Shot Tasks
CVPR 2022
0
citations
Unsupervised Visual Representation Learning by Online Constrained K-Means
CVPR 2022
0
citations
EPro-PnP: Generalized End-to-End Probabilistic Perspective-N-Points for Monocular Object Pose Estimation
CVPR 2022
0
citations
Task Adaptive Parameter Sharing for Multi-Task Learning
CVPR 2022 (arXiv)
0
citations
MogFace: Towards a Deeper Appreciation on Face Detection
CVPR 2022 (arXiv)
0
citations
AutoLoss-Zero: Searching Loss Functions From Scratch for Generic Tasks
CVPR 2022
0
citations
Learning To Listen: Modeling Non-Deterministic Dyadic Facial Motion
CVPR 2022 (arXiv)
0
citations
Decoupling and Recoupling Spatiotemporal Representation for RGB-D-Based Motion Recognition
CVPR 2022
0
citations
An Efficient Training Approach for Very Large Scale Face Recognition
CVPR 2022 (arXiv)
0
citations
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks
CVPR 2023 (arXiv)
0
citations
StyleGene: Crossover and Mutation of Region-Level Facial Genes for Kinship Face Synthesis
CVPR 2023
0
citations
Learning a Sparse Transformer Network for Effective Image Deraining
CVPR 2023 (arXiv)
0
citations
SteerNeRF: Accelerating NeRF Rendering via Smooth Viewpoint Trajectory
CVPR 2023 (arXiv)
0
citations
The ObjectFolder Benchmark: Multisensory Learning With Neural and Real Objects
CVPR 2023
0
citations
Guided Recommendation for Model Fine-Tuning
CVPR 2023
0
citations
Boosting Low-Data Instance Segmentation by Unsupervised Pre-Training With Saliency Prompt
CVPR 2023 (arXiv)
0
citations
Learning Dense Facial Correspondences in Unconstrained Images
ICCV 2017 (arXiv)
0
citations
Realistic Dynamic Facial Textures From a Single Image Using GANs
ICCV 2017
0
citations
Improved Techniques for Training Adaptive Deep Networks
ICCV 2019
0
citations
PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization
ICCV 2019
0
citations
SoftTriple Loss: Deep Metric Learning Without Triplet Sampling
ICCV 2019
0
citations
Transformable Bottleneck Networks
ICCV 2019
0
citations
Soft Rasterizer: A Differentiable Renderer for Image-Based 3D Reasoning
ICCV 2019
0
citations
Learning Perspective Undistortion of Portraits
ICCV 2019
0
citations
Learning to Rank Proposals for Object Detection
ICCV 2019
0
citations
PlenOctrees for Real-Time Rendering of Neural Radiance Fields
ICCV 2021 (arXiv)
0
citations
TransReID: Transformer-Based Object Re-Identification
ICCV 2021 (arXiv)
0
citations
Convex Relaxation for Robust Vanishing Point Estimation in Manhattan World
CVPR 2025
0
citations
Digging Into Uncertainty in Self-Supervised Multi-View Stereo
ICCV 2021 (arXiv)
0
citations
Zen-NAS: A Zero-Shot NAS for High-Performance Image Recognition
ICCV 2021
0
citations
DisUnknown: Distilling Unknown Factors for Disentanglement Learning
ICCV 2021 (arXiv)
0
citations
A Simple Baseline for Semi-Supervised Semantic Segmentation With Strong Data Augmentation
ICCV 2021 (arXiv)
0
citations
Topologically Consistent Multi-View Face Inference Using Volumetric Sampling
ICCV 2021 (arXiv)
0
citations
MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection
ICCV 2023 (arXiv)
0
citations
NDC-Scene: Boost Monocular 3D Semantic Scene Completion in Normalized Device Coordinates Space
ICCV 2023
0
citations
XMem++: Production-level Video Segmentation From Few Annotated Frames
ICCV 2023
0
citations
Video Action Recognition with Attentive Semantic Units
ICCV 2023 (arXiv)
0
citations
Clusterformer: Cluster-based Transformer for 3D Object Detection in Point Clouds
ICCV 2023
0
citations
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
ICCV 2023 (arXiv)
0
citations
Monocular Real-Time Volumetric Performance Capture
ECCV 2020
0
citations
Enhancing Multi-modal Features Using Local Self-Attention for 3D Object Detection
ECCV 2022
0
citations
Unstructured Feature Decoupling for Vehicle Re-identification
ECCV 2022
0
citations
DLME: Deep Local-Flatness Manifold Embedding
ECCV 2022
0
citations
KVT: k-NN Attention for Boosting Vision Transformers
ECCV 2022
0
citations
TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation
ECCV 2022
0
citations
Weakly Supervised Representation Learning With Coarse Labels
ICCV 2021 (arXiv)
0
citations
DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis
CVPR 2025
0
citations
CCIN: Compositional Conflict Identification and Neutralization for Composed Image Retrieval
CVPR 2025
0
citations
Spatial-Temporal Graph Diffusion Policy with Kinematic Modeling for Bimanual Robotic Manipulation
CVPR 2025
0
citations
FuXi-RTM: A Physics-Guided Prediction Framework with Radiative Transfer Modeling
ICCV 2025
0
citations
DH-FaceVid-1K: A Large-Scale High-Quality Dataset for Face Video Generation
ICCV 2025
0
citations
FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration
ICCV 2025
0
citations
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion
ICCV 2025
0
citations
LangBridge: Interpreting Image as a Combination of Language Embeddings
ICCV 2025
0
citations
CityGS-X: A Scalable Architecture for Efficient and Geometrically Accurate Large-Scale Scene Reconstruction
ICCV 2025
0
citations
Cross-Category Subjectivity Generalization for Style-Adaptive Sketch Re-ID
ICCV 2025
0
citations
QR-LoRA: Efficient and Disentangled Fine-tuning via QR Decomposition for Customized Generation
ICCV 2025
0
citations
AU-Blendshape for Fine-grained Stylized 3D Facial Expression Manipulation
ICCV 2025
0
citations
Deconfound Semantic Shift and Incompleteness in Incremental Few-shot Semantic Segmentation
AAAI 2025
0
citations
MUCD: Unsupervised Point Cloud Change Detection via Masked Consistency
AAAI 2025
0
citations
HieraFashDiff: Hierarchical Fashion Design with Multi-stage Diffusion Models
AAAI 2025
0
citations
Partial Point Cloud Registration with Multi-view 2D Image Learning
AAAI 2025
0
citations
AdvDisplay: Adversarial Display Assembled by Thermoelectric Cooler for Fooling Thermal Infrared Detectors
AAAI 2025
0
citations
Gradual Residuals Alignment: A Dual-Stream Framework for GAN Inversion and Image Attribute Editing
AAAI 2024
0
citations
Robustly Train Normalizing Flows via KL Divergence Regularization
AAAI 2024
0
citations
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft
CVPR 2024
0
citations
On the Scalability of Diffusion-based Text-to-Image Generation
CVPR 2024
0
citations
MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models
NeurIPS 2018
0
citations
Visualizing the Loss Landscape of Neural Nets
NeurIPS 2018
0
citations
Learning to Infer Implicit Surfaces without 3D Supervision
NeurIPS 2019
0
citations
Fully Convolutional Mesh Autoencoder using Efficient Spatially Varying Kernels
NeurIPS 2020
0
citations
HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning
NeurIPS 2021
0
citations
A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval
NeurIPS 2022
0
citations
VTC-LFC: Vision Transformer Compression with Low-Frequency Components
NeurIPS 2022
0
citations
Entropy-Driven Mixed-Precision Quantization for Deep Network Design
NeurIPS 2022
0
citations
Improved Fine-Tuning by Better Leveraging Pre-Training Data
NeurIPS 2022
0
citations
Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval
NeurIPS 2023
0
citations
JourneyDB: A Benchmark for Generative Image Understanding
NeurIPS 2023
0
citations
Adaptive Consensus ADMM for Distributed Optimization
ICML 2017
0
citations