Jiasen Lu
Papers: 4
Total Citations: 150
Papers (4)
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
CVPR 2025, 96 citations
One Diffusion to Generate Them All
CVPR 2025, 34 citations
STIV: Scalable Text and Image Conditioned Video Generation
ICCV 2025, 20 citations
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision Language Audio and Action
CVPR 2024, 0 citations