ICLR 2025 "multimodal tasks" Papers
5 papers found
DAMO: Decoding by Accumulating Activations Momentum for Mitigating Hallucinations in Vision-Language Models
Kaishen Wang, Hengrui Gu, Meijun Gao et al.
ICLR 2025 poster
7 citations
Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Explanations?
Letitia Parcalabescu, Anette Frank
ICLR 2025 poster, arXiv:2404.18624
20 citations
How Does Vision-Language Adaptation Impact the Safety of Vision Language Models?
Seongyun Lee, Geewook Kim, Jiyeon Kim et al.
ICLR 2025 poster, arXiv:2410.07571
4 citations
Refining CLIP's Spatial Awareness: A Visual-Centric Perspective
Congpei Qiu, Yanhao Wu, Wei Ke et al.
ICLR 2025 poster, arXiv:2504.02328
6 citations
Teaching Human Behavior Improves Content Understanding Abilities Of VLMs
Somesh Singh, Harini S I, Yaman Singla et al.
ICLR 2025 poster
2 citations