2025 "large vision-language models" Papers
14 papers found
Compress & Cache: Vision token compression for efficient generation and retrieval
Adrian Bulat, Yassine Ouali, Georgios Tzimiropoulos
NeurIPS 2025 poster
Cross-Modal Safety Mechanism Transfer in Large Vision-Language Models
Shicheng Xu, Liang Pang, Yunchang Zhu et al.
ICLR 2025 poster, arXiv:2410.12662, 14 citations
DSBench: How Far Are Data Science Agents from Becoming Data Science Experts?
Liqiang Jing, Zhehui Huang, Xiaoyang Wang et al.
ICLR 2025 poster, arXiv:2409.07703, 62 citations
Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs
Jie Zhang, Zhongqi Wang, Mengqi Lei et al.
ICLR 2025 poster, arXiv:2406.18849, 2 citations
How Does Vision-Language Adaptation Impact the Safety of Vision Language Models?
Seongyun Lee, Geewook Kim, Jiyeon Kim et al.
ICLR 2025 poster, arXiv:2410.07571, 4 citations
ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models
Junzhe Chen, Tianshu Zhang, Shiyu Huang et al.
CVPR 2025 poster, arXiv:2411.15268, 11 citations
Latent Chain-of-Thought for Visual Reasoning
Guohao Sun, Hang Hua, Jian Wang et al.
NeurIPS 2025 poster, arXiv:2510.23925, 7 citations
LVLM-Driven Attribute-Aware Modeling for Visible-Infrared Person Re-Identification
Zhiqi Pang, Lingling Zhao, Junjie Wang et al.
NeurIPS 2025 poster
MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs
Xuannan Liu, Zekun Li, Pei Li et al.
ICLR 2025 poster, arXiv:2406.08772, 49 citations
Provoking Multi-modal Few-Shot LVLM via Exploration-Exploitation In-Context Learning
Cheng Chen, Yunpeng Zhai, Yifan Zhao et al.
CVPR 2025 poster, arXiv:2506.09473, 1 citation
SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions
Xianzhe Fan, Xuhui Zhou, Chuanyang Jin et al.
NeurIPS 2025 poster, arXiv:2506.23046, 5 citations
Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation
Gianni Franchi, Nacim Belkhir, Dat Nguyen et al.
CVPR 2025 poster, arXiv:2412.03178, 3 citations
Visual Structures Help Visual Reasoning: Addressing the Binding Problem in LVLMs
Amirmohammad Izadi, Mohammadali Banayeeanzade, Fatemeh Askari et al.
NeurIPS 2025 poster, 1 citation
VladVA: Discriminative Fine-tuning of LVLMs
Yassine Ouali, Adrian Bulat, Alexandros Xenos et al.
CVPR 2025 poster, arXiv:2412.04378, 11 citations