XIAOJUAN QI
4 Papers
45 Total Citations
Papers (4)
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
NeurIPS 2025 | arXiv
18 citations
SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix
ICLR 2025 | arXiv
12 citations
Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Generation
NeurIPS 2025
8 citations
EA-VTR: Event-Aware Video-Text Retrieval
ECCV 2024 | arXiv
7 citations