2025 Highlight "vision-language models" Papers
9 papers found
AdvDreamer Unveils: Are Vision-Language Models Truly Ready for Real-World 3D Variations?
Shouwei Ruan, Hanqing Liu, Yao Huang et al.
ICCV 2025highlightarXiv:2412.03002
2
citations
Context-Aware Multimodal Pretraining
Karsten Roth, Zeynep Akata, Dima Damen et al.
CVPR 2025highlightarXiv:2411.15099
4
citations
DistinctAD: Distinctive Audio Description Generation in Contexts
Bo Fang, Wenhao Wu, Qiangqiang Wu et al.
CVPR 2025highlightarXiv:2411.18180
4
citations
Galaxy Walker: Geometry-aware VLMs For Galaxy-scale Understanding
Tianyu Chen, Xingcheng Fu, Yisen Gao et al.
CVPR 2025highlightarXiv:2503.18578
2
citations
Not Only Text: Exploring Compositionality of Visual Representations in Vision-Language Models
Davide Berasi, Matteo Farina, Massimiliano Mancini et al.
CVPR 2025highlightarXiv:2503.17142
3
citations
Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding
Wei Suo, Lijun Zhang, Mengyang Sun et al.
CVPR 2025highlightarXiv:2503.00361
15
citations
Realistic Test-Time Adaptation of Vision-Language Models
Maxime Zanella, Clément Fuchs, Christophe De Vleeschouwer et al.
CVPR 2025highlightarXiv:2501.03729
STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection
Divya Velayudhan, Abdelfatah Ahmed, Mohamad Alansari et al.
CVPR 2025highlightarXiv:2504.02823
2
citations
Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding
seil kang, Jinyeong Kim, Junhyeok Kim et al.
CVPR 2025highlightarXiv:2503.06287
31
citations