Hang Xu
21
Papers
430
Total Citations
Papers (21)
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
ICLR 2025
169
citations
DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection
CVPR 2024
45
citations
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
CVPR 2025
44
citations
Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis
ICLR 2024
44
citations
ILLUME: Illuminating Your LLMs to See, Draw, and Self-Enhance
ICCV 2025
43
citations
HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance
ECCV 2024
14
citations
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models
CVPR 2025
13
citations
FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors
ICCV 2025
12
citations
Self-Adaptive Reality-Guided Diffusion for Artifact-Free Super-Resolution
CVPR 2024
11
citations
LaneGraph2Seq: Lane Topology Extraction with Language Model via Vertex-Edge Encoding and Connectivity Enhancement
AAAI 2024
10
citations
TextField3D: Towards Enhancing Open-Vocabulary 3D Generation with Noisy Text Fields
ICLR 2024
9
citations
ACE: Anti-Editing Concept Erasure in Text-to-Image Models
CVPR 2025
8
citations
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
CVPR 2025
8
citations
Rethinking Boundary Discontinuity Problem for Oriented Object Detection
CVPR 2024
0
citations
VTimeCoT: Thinking by Drawing for Video Temporal Grounding and Reasoning
ICCV 2025
0
citations
Adaptive Dropout: Unleashing Dropout across Layers for Generalizable Image Super-Resolution
CVPR 2025
0
citations
FreeDNA: Endowing Domain Adaptation of Diffusion-Based Dense Prediction with Training-Free Domain Noise Alignment
ICCV 2025
0
citations
DreamControl: Control-Based Text-to-3D Generation with 3D Self-Prior
CVPR 2024
0
citations
Holistic Autonomous Driving Understanding by Bird’s-Eye-View Injected Multi-Modal Large Models
CVPR 2024
0
citations
BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models
CVPR 2024
0
citations
Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images
AAAI 2024
0
citations