Zhaokai Wang
4
Papers
102
Total Citations
Papers (4)
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
CVPR 2025
68
citations
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding
CVPR 2025
34
citations
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft
CVPR 2024
0
citations
Video Background Music Generation: Dataset, Method and Evaluation
ICCV 2023arXiv
0
citations