Papers by Xiaojian (Shawn) Ma
3 papers found
From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes
Tianxu Wang, Zhuofan Zhang, Ziyu Zhu et al.
NeurIPS 2025 | Poster | arXiv:2506.04897
Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning
Pengxiang Li, Zhi Gao, Bofei Zhang et al.
NeurIPS 2025 | Poster | arXiv:2504.21561
NEP: Autoregressive Image Editing via Next Editing Token Prediction
Huimin Wu, Xiaojian (Shawn) Ma, Haozhe Zhao et al.
NeurIPS 2025 | Poster | arXiv:2508.06044