2025 "spatial understanding" Papers
2 papers found
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
Fanqing Meng, Jin Wang, Chuanhao Li et al.
ICLR 2025, poster, arXiv:2408.02718
48 citations
See&Trek: Training-Free Spatial Prompting for Multimodal Large Language Model
Pengteng Li, Pinhao Song, Wuyang Li et al.
NeurIPS 2025, oral, arXiv:2509.16087
1 citation