CVPR Poster "multimodal understanding" Papers
2 papers found
OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation
Pengfei Zhou, Xiaopeng Peng, Jiajun Song et al.
CVPR 2025, poster, arXiv:2411.18499
19 citations
TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation
Liao Qu, Huichao Zhang, Yiheng Liu et al.
CVPR 2025, poster, arXiv:2412.03069
120 citations