Xinyu Wei
4 Papers
76 Total Citations
Papers (4)
Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
NeurIPS 2025, arXiv
29 citations
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
ICLR 2025
26 citations
Cloud-Device Collaborative Learning for Multimodal Large Language Models
CVPR 2024
18 citations
Event2Tracking: Reconstructing Multi-Agent Soccer Trajectories Using Long-Term Multimodal Context
AAAI 2025
3 citations