"vision-language integration" Papers
4 papers found
EVEv2: Improved Baselines for Encoder-Free Vision-Language Models
Haiwen Diao, Xiaotong Li, Yufeng Cui et al.
ICCV 2025, highlight, arXiv:2502.06788
18 citations
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models
Jiabo Ye, Haiyang Xu, Haowei Liu et al.
ICLR 2025, poster, arXiv:2408.04840
237 citations
Multi-Factor Adaptive Vision Selection for Egocentric Video Question Answering
Haoyu Zhang, Meng Liu, Zixin Liu et al.
ICML 2024, oral
Revealing Vision-Language Integration in the Brain with Multimodal Networks
Vighnesh Subramaniam, Colin Conwell, Christopher Wang et al.
ICML 2024, poster