Haiyang Xu

Papers: 9

Total Citations: 996

Papers (9)

mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration

mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models

Hallucination Augmented Contrastive Learning for Multimodal Large Language Model

Bayesian Diffusion Models for 3D Shape Reconstruction

SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization

YOLO-Count: Differentiable Object Counting for Text-to-Image Generation

TiMix: Text-Aware Image Mixing for Effective Vision-Language Pre-training

DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion

Science-T2I: Addressing Scientific Illusions in Image Synthesis