2025 "visual reasoning" Papers
16 papers found
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program
Minghe Gao, Xuqi Liu, Zhongqi Yue et al.
ICCV 2025 | poster | arXiv:2504.06606
10 citations
CogCoM: A Visual Language Model with Chain-of-Manipulations Reasoning
Ji Qi, Ming Ding, Weihan Wang et al.
ICLR 2025 | poster | arXiv:2402.04236
33 citations
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
Yang Yue, Zhiqi Chen, Rui Lu et al.
NeurIPS 2025 | oral | arXiv:2504.13837
483 citations
DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning
Fucai Ke, Vijay Kumar B G, Xingjian Leng et al.
ICCV 2025 | poster | arXiv:2503.19263
6 citations
Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning
Tianyi Bai, Yuxuan Fan, Qiu Jiantao et al.
NeurIPS 2025 | poster | arXiv:2506.07227
2 citations
Latent Chain-of-Thought for Visual Reasoning
Guohao Sun, Hang Hua, Jian Wang et al.
NeurIPS 2025 | poster | arXiv:2510.23925
7 citations
Mind the GAP: Glimpse-based Active Perception improves generalization and sample efficiency of visual reasoning
Oleh Kolner, Thomas Ortner, Stanisław Woźniak et al.
ICLR 2025 | poster | arXiv:2409.20213
Neurosymbolic Diffusion Models
Emile van Krieken, Pasquale Minervini, Edoardo Maria Ponti et al.
NeurIPS 2025 | poster | arXiv:2505.13138
3 citations
Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning
Yana Wei, Liang Zhao, Jianjian Sun et al.
NeurIPS 2025 | poster | arXiv:2507.05255
14 citations
OpenVLThinker: Complex Vision-Language Reasoning via Iterative SFT-RL Cycles
Yihe Deng, Hritik Bansal, Fan Yin et al.
NeurIPS 2025 | poster | arXiv:2503.17352
15 citations
UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning
Ye Liu, Zongyang Ma, Junfu Pu et al.
NeurIPS 2025 | poster | arXiv:2509.18094
4 citations
VideoAds for Fast-Paced Video Understanding
Zheyuan Zhang, Wanying Dou, Linkai Peng et al.
ICCV 2025 | poster | arXiv:2504.09282
2 citations
VIKI‑R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning
Li Kang, Xiufeng Song, Heng Zhou et al.
NeurIPS 2025 | poster | arXiv:2506.09049
8 citations
Visual Jenga: Discovering Object Dependencies via Counterfactual Inpainting
Anand Bhattad, Konpat Preechakul, Alexei Efros
NeurIPS 2025 | poster | arXiv:2503.21770
8 citations
VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank
Tianhe Wu, Jian Zou, Jie Liang et al.
NeurIPS 2025 | spotlight | arXiv:2505.14460
30 citations
Visual Structures Help Visual Reasoning: Addressing the Binding Problem in LVLMs
Amirmohammad Izadi, Mohammadali Banayeeanzade, Fatemeh Askari et al.
NeurIPS 2025 | poster
1 citation