"referring expression comprehension" Papers
6 papers found
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks
Miran Heo, Min-Hung Chen, De-An Huang et al.
CVPR 2025, poster, arXiv:2501.08326
9 citations
Referring to Any Person
Qing Jiang, Lin Wu, Zhaoyang Zeng et al.
ICCV 2025, poster, arXiv:2503.08507
13 citations
Vision-Language Models Can't See the Obvious
Yasser Abdelaziz Dahou Djilali, Ngoc Huynh, Phúc Lê Khắc et al.
ICCV 2025, poster, arXiv:2507.04741
7 citations
DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM
Yixuan Wu, Yizhou Wang, Shixiang Tang et al.
ECCV 2024, poster, arXiv:2403.12488
47 citations
GroundVLP: Harnessing Zero-Shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection
Haozhan Shen, Tiancheng Zhao, Mingwei Zhu et al.
AAAI 2024, paper, arXiv:2312.15043
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models
Hao Zhang, Hongyang Li, Feng Li et al.
ECCV 2024, poster, arXiv:2312.02949
114 citations