2025 "multi-modal reasoning" Papers
7 papers found
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data and Metric Perspectives
Shaoyuan Xie, Lingdong Kong, Yuhao Dong et al.
ICCV 2025 (poster) · arXiv:2501.04003 · 71 citations
AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs
Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta et al.
ICCV 2025 (poster) · arXiv:2501.02135 · 9 citations
DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding
Yue Jiang, Jichu Li, Yang Liu et al.
NEURIPS 2025 (oral) · arXiv:2505.18411 · 3 citations
EvolvedGRPO: Unlocking Reasoning in LVLMs via Progressive Instruction Evolution
Zhebei Shen, Qifan Yu, Juncheng Li et al.
NEURIPS 2025 (poster)
InstructHOI: Context-Aware Instruction for Multi-Modal Reasoning in Human-Object Interaction Detection
Jinguo Luo, Weihong Ren, Quanlong Zheng et al.
NEURIPS 2025 (spotlight)
Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding
Wei Suo, Lijun Zhang, Mengyang Sun et al.
CVPR 2025 (highlight) · arXiv:2503.00361 · 15 citations
Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning
Minheng Ni, YuTao Fan, Lei Zhang et al.
ICLR 2025 (poster) · arXiv:2410.03321 · 20 citations