Yifan Xu
5
Papers
261
Total Citations
Papers (5)
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
AAAI 2024, arXiv
190
citations
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents
ICLR 2025
67
citations
Neural Motion Simulator: Pushing the Limit of World Models in Reinforcement Learning
CVPR 2025
4
citations
MMGDreamer: Mixed-Modality Graph for Geometry-Controllable 3D Indoor Scene Generation
AAAI 2025
0
citations
Libra: Building Decoupled Vision System on Large Language Models
ICML 2024
0
citations