Shijia Huang
4
Papers
255
Total Citations
Papers (4)
Towards Learning a Generalist Model for Embodied Navigation
CVPR 2024
117
citations
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models
ECCV 2024, arXiv
114
citations
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors
NeurIPS 2025, arXiv
24
citations
Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding
CVPR 2025
0
citations