"visual perception" Papers
3 papers found
Are Large Vision Language Models Good Game Players?
Xinyu Wang, Bohan Zhuang, Qi Wu
ICLR 2025posterarXiv:2503.02358
13
citations
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
Min Shi, Fuxiao Liu, Shihao Wang et al.
ICLR 2025posterarXiv:2408.15998
116
citations
ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models
Rohan Wadhawan, Hritik Bansal, Kai-Wei Chang et al.
ICML 2024poster