2025 "vision-language datasets" Papers
2 papers found
GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution
Fengxiang Wang, Mingshuo Chen, Yueying Li et al.
NeurIPS 2025, spotlight, arXiv:2505.21375
11 citations
Semantic and Expressive Variations in Image Captions Across Languages
Andre Ye, Sebastin Santy, Jena D. Hwang et al.
CVPR 2025, poster, arXiv:2310.14356
5 citations