Haiyang Xu
9 Papers
996 Total Citations
Papers (9)
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
CVPR 2024, arXiv
601 citations
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models
ICLR 2025, arXiv
237 citations
Hallucination Augmented Contrastive Learning for Multimodal Large Language Model
CVPR 2024
116 citations
Bayesian Diffusion Models for 3D Shape Reconstruction
CVPR 2024, arXiv
23 citations
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization
CVPR 2025, arXiv
7 citations
YOLO-Count: Differentiable Object Counting for Text-to-Image Generation
ICCV 2025, arXiv
6 citations
TiMix: Text-Aware Image Mixing for Effective Vision-Language Pre-training
AAAI 2024, arXiv
6 citations
DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion
ICCV 2025
0 citations
Science-T2I: Addressing Scientific Illusions in Image Synthesis
CVPR 2025
0 citations