2025 "spatial reasoning" Papers
16 papers found
ChatVLA-2: Vision-Language-Action Model with Open-World Reasoning
Zhongyi Zhou, Yichen Zhu, Xiaoyu Liu et al.
NeurIPS 2025poster
CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
Xinhao Liu, Jintong Li, Yicheng Jiang et al.
CVPR 2025posterarXiv:2411.17820
25
citations
Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning
Xingjian Ran, Yixuan Li, Linning Xu et al.
NeurIPS 2025posterarXiv:2506.05341
5
citations
Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation
Jitesh Jain, Zhengyuan Yang, Humphrey Shi et al.
NeurIPS 2025posterarXiv:2412.09585
4
citations
Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs
Yifan Shen, Yuanzhe Liu, Jingyuan Zhu et al.
NeurIPS 2025posterarXiv:2506.21656
3
citations
From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes
Tianxu Wang, Zhuofan Zhang, Ziyu Zhu et al.
NeurIPS 2025posterarXiv:2506.04897
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors
Duo Zheng, shijia Huang, Yanyang Li et al.
NeurIPS 2025posterarXiv:2505.24625
24
citations
Locality Alignment Improves Vision-Language Models
Ian Covert, Tony Sun, James Y Zou et al.
ICLR 2025posterarXiv:2410.11087
ORIGAMISPACE: Benchmarking Multimodal LLMs in Multi-Step Spatial Reasoning with Mathematical Constraints
Rui Xu, Dakuan Lu, Zicheng Zhao et al.
NeurIPS 2025spotlightarXiv:2511.18450
2
citations
Re-Thinking Inverse Graphics With Large Language Models
Haiwen Feng, Michael J Black, Weiyang Liu et al.
ICLR 2025posterarXiv:2404.15228
15
citations
Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics
Dongyoung Kim, Huiwon Jang, Sumin Park et al.
NeurIPS 2025posterarXiv:2506.00070
9
citations
Robust Cross-modal Alignment Learning for Cross-Scene Spatial Reasoning and Grounding
Yanglin Feng, Hongyuan Zhu, Dezhong Peng et al.
NeurIPS 2025poster
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation
Zekun Qi, Wenyao Zhang, Yufei Ding et al.
NeurIPS 2025spotlightarXiv:2502.13143
33
citations
SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Model
Yue Zhang, Zhiyang Xu, Ying Shen et al.
ICLR 2025posterarXiv:2410.03878
19
citations
Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs
Fangrui Zhu, Hanhui Wang, Yiming Xie et al.
NeurIPS 2025posterarXiv:2506.04220
Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction
Junyi Chen, Di Huang, Weicai Ye et al.
ICLR 2025posterarXiv:2410.18962
4
citations