2025 Poster "visual comprehension" Papers
4 papers found
Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens
Kaihang Pan, Wang Lin, Zhongqi Yue et al.
CVPR 2025 | poster | arXiv:2504.14666
18 citations
Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning
Kaihang Pan, Yang Wu, Wendong Bu et al.
NeurIPS 2025 | poster | arXiv:2506.01480
6 citations
LIRA: Inferring Segmentation in Large Multi-modal Models with Local Interleaved Region Assistance
Zhang Li, Biao Yang, Qiang Liu et al.
ICCV 2025 | poster | arXiv:2507.06272
1 citation
ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models
Hongbo Liu, Jingwen He, Yi Jin et al.
NeurIPS 2025 | poster | arXiv:2506.21356
7 citations