Gao Huang
23
Papers
228
Total Citations
Papers (23)
GSVA: Generalized Segmentation via Multimodal Large Language Models
CVPR 2024
127
citations
Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis
CVPR 2024
28
citations
DyFADet: Dynamic Feature Aggregation for Temporal Action Detection
ECCV 2024
21
citations
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation
CVPR 2025
20
citations
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
ECCV 2024
15
citations
Video Perception Models for 3D Scene Synthesis
NeurIPS 2025
5
citations
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
CVPR 2025
5
citations
GridMix: Exploring Spatial Modulation for Neural Fields in PDE Modeling
ICLR 2025
4
citations
DTOS: Dynamic Time Object Sensing with Large Multimodal Model
CVPR 2025
2
citations
IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance
ICCV 2025
1
citations
Prompt-Free Diffusion: Taking “Text” out of Text-to-Image Diffusion Models
CVPR 2024
0
citations
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
CVPR 2024
0
citations
ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding
CVPR 2025
0
citations
SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning
ICML 2024
0
citations
Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment
CVPR 2025
0
citations
EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance
CVPR 2025
0
citations
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
CVPR 2025
0
citations
CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning
CVPR 2025
0
citations
CODA: Repurposing Continuous VAEs for Discrete Tokenization
ICCV 2025
0
citations
DiveR-CT: Diversity-enhanced Red Teaming Large Language Model Assistants with Relaxing Constraints
AAAI 2025
0
citations
ExpeL: LLM Agents Are Experiential Learners
AAAI 2024
0
citations
Exploring Temporal Feature Correlation for Efficient and Stable Video Semantic Segmentation
AAAI 2024
0
citations
Mask Grounding for Referring Image Segmentation
CVPR 2024
0
citations