2025 Poster "visual grounding" Papers
9 papers found
Acknowledging Focus Ambiguity in Visual Questions
Chongyan Chen, Yu-Yun Tseng, Zhuoheng Li et al.
ICCV 2025 (poster) · arXiv:2501.02201 · 2 citations
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data and Metric Perspectives
Shaoyuan Xie, Lingdong Kong, Yuhao Dong et al.
ICCV 2025 (poster) · arXiv:2501.04003 · 71 citations
Controlling Multimodal LLMs via Reward-guided Decoding
Oscar Mañas, Pierluca D'Oro, Koustuv Sinha et al.
ICCV 2025 (poster) · arXiv:2508.11616
DAMO: Decoding by Accumulating Activations Momentum for Mitigating Hallucinations in Vision-Language Models
Kaishen Wang, Hengrui Gu, Meijun Gao et al.
ICLR 2025 (poster) · 7 citations
F-LMM: Grounding Frozen Large Multimodal Models
Size Wu, Sheng Jin, Wenwei Zhang et al.
CVPR 2025 (poster) · arXiv:2406.05821 · 21 citations
Grounding Language with Vision: A Conditional Mutual Information Calibrated Decoding Strategy for Reducing Hallucinations in LVLMs
Hao Fang, Changle Zhou, Jiawei Kong et al.
NeurIPS 2025 (poster) · arXiv:2505.19678 · 6 citations
MLLMs Need 3D-Aware Representation Supervision for Scene Understanding
Xiaohu Huang, Jingjing Wu, Qunyi Xie et al.
NeurIPS 2025 (poster) · arXiv:2506.01946 · 17 citations
PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training
Cong Chen, Mingyu Liu, Chenchen Jing et al.
ICLR 2025 (poster) · arXiv:2503.06486 · 25 citations
Visually Consistent Hierarchical Image Classification
Seulki Park, Youren Zhang, Stella Yu et al.
ICLR 2025 (poster) · arXiv:2406.11608 · 4 citations