2025 "pixel-level grounding" Papers
2 papers found
Ground-V: Teaching VLMs to Ground Complex Instructions in Pixels
Yongshuo Zong, Qin Zhang, Dongsheng An et al.
CVPR 2025, poster, arXiv:2505.13788
3 citations
Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
Hao Zhong, Muzhi Zhu, Zongze Du et al.
NeurIPS 2025, oral, arXiv:2505.20256
12 citations