Di Hu
9
Papers
119
Total Citations
Papers (9)
Enhancing Multimodal Cooperation via Sample-level Modality Valuation
CVPR 2024
51
citations
Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer
AAAI 2024, arXiv
38
citations
Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes
ECCV 2024
23
citations
Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition
CVPR 2025
7
citations
Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation
CVPR 2025
0
citations
MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance
ICML 2024
0
citations
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
CVPR 2025
0
citations
Phoenix: A Motion-based Self-Reflection Framework for Fine-grained Robotic Action Correction
CVPR 2025
0
citations
MokA: Multimodal Low-Rank Adaptation for MLLMs
NeurIPS 2025
0
citations