Shaofei Huang
5
Papers
0
Total Citations
Papers (5)
LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding
CVPR 2025
0
citations
Revisiting Audio-Visual Segmentation with Vision-Centric Transformer
CVPR 2025
0
citations
Video2BEV: Transforming Drone Videos to BEVs for Video-based Geo-localization
ICCV 2025
0
citations
Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation
AAAI 2025
0
citations
Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training
CVPR 2024
0
citations