Zehuan Yuan

Papers: 6

Total Citations: 152

Papers (6)

General Object Foundation Model for Images and Videos at Scale

Goku: Flow Based Video Generative Foundation Models

EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE

TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation

Infinity∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Generative Region-Language Pretraining for Open-Ended Object Detection