2025 "vision encoder" Papers
2 papers found
LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs
Haoran Lou, Chunxiao Fan, Ziyan Liu et al.
ICCV 2025, poster, arXiv:2507.00505
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning
Xianhang Li, Yanqing Liu, Haoqin Tu et al.
ICCV 2025, poster, arXiv:2505.04601
6 citations