Highlight "visual grounding" Papers
4 papers found
COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts
Jiansheng Li, Xingxuan Zhang, Hao Zou et al.
CVPR 2025highlightarXiv:2504.10158
1
citations
Move to Understand a 3D Scene: Bridging Visual Grounding and Exploration for Efficient and Versatile Embodied Navigation
ZIYU ZHU, Xilin Wang, Yixuan Li et al.
ICCV 2025highlightarXiv:2507.04047
24
citations
STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection
Divya Velayudhan, Abdelfatah Ahmed, Mohamad Alansari et al.
CVPR 2025highlightarXiv:2504.02823
2
citations
Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding
seil kang, Jinyeong Kim, Junhyeok Kim et al.
CVPR 2025highlightarXiv:2503.06287
31
citations