ICLR 2025 "multimodal reasoning" Papers
4 papers found
AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models
Kim Sung-Bin, Oh Hyun-Bin, Lee Jung-Mok et al.
ICLR 2025 (poster), arXiv:2410.18325
17 citations
Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference under Ambiguities
Zheyuan Zhang, Fengyuan Hu, Jayjun Lee et al.
ICLR 2025 (poster), arXiv:2410.17385
40 citations
NL-Eye: Abductive NLI For Images
Mor Ventura, Michael Toker, Nitay Calderon et al.
ICLR 2025 (poster), arXiv:2410.02613
3 citations
Temporal Reasoning Transfer from Text to Video
Lei Li, Yuanxin Liu, Linli Yao et al.
ICLR 2025 (oral), arXiv:2410.06166
20 citations