CVPR 2025 "visual grounding" Papers
4 papers found
COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts
Jiansheng Li, Xingxuan Zhang, Hao Zou et al.
CVPR 2025, highlight, arXiv:2504.10158
1 citation
F-LMM: Grounding Frozen Large Multimodal Models
Size Wu, Sheng Jin, Wenwei Zhang et al.
CVPR 2025, poster, arXiv:2406.05821
21 citations
STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection
Divya Velayudhan, Abdelfatah Ahmed, Mohamad Alansari et al.
CVPR 2025, highlight, arXiv:2504.02823
2 citations
Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding
Seil Kang, Jinyeong Kim, Junhyeok Kim et al.
CVPR 2025, highlight, arXiv:2503.06287
31 citations