2024 "vision language models" Papers
12 papers found
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
Wenbo Hu, Yifan Xu, Yi Li et al.
AAAI 2024paperarXiv:2308.09936
190
citations
Cross-Domain Semantic Segmentation on Inconsistent Taxonomy using VLMs
Jeongkee Lim, Yusung Kim
ECCV 2024posterarXiv:2408.02261
5
citations
Detecting and Preventing Hallucinations in Large Vision Language Models
Anisha Gunjal, Jihan Yin, Erhan Bas
AAAI 2024paperarXiv:2308.06394
256
citations
Diagnosing the Compositional Knowledge of Vision Language Models from a Game-Theoretic View
Jin Wang, Shichao Dong, Yapeng Zhu et al.
ICML 2024posterarXiv:2405.17201
LCA-on-the-Line: Benchmarking Out of Distribution Generalization with Class Taxonomies
Jia Shi, Gautam Rajendrakumar Gare, Jinjin Tian et al.
ICML 2024poster
Leveraging VLM-Based Pipelines to Annotate 3D Objects
Rishabh Kabra, Loic Matthey, Alexander Lerchner et al.
ICML 2024posterarXiv:2311.17851
Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment
Brian Gordon, Yonatan Bitton, Yonatan Shafir et al.
ECCV 2024posterarXiv:2312.03766
17
citations
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
Soroush Nasiriany, Fei Xia, Wenhao Yu et al.
ICML 2024posterarXiv:2402.07872
ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open Vocabulary Object Detection
Joonhyun Jeong, Geondo Park, Jayeon Yoo et al.
AAAI 2024paperarXiv:2312.07266
16
citations
RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback
Yufei Wang, Zhanyi Sun, Jesse Zhang et al.
ICML 2024posterarXiv:2402.03681
Soft Prompt Generation for Domain Generalization
Shuanghao Bai, Yuedi Zhang, Wanqi Zhou et al.
ECCV 2024posterarXiv:2404.19286
30
citations
TrojVLM: Backdoor Attack Against Vision Language Models
Weimin Lyu, Lu Pang, Tengfei Ma et al.
ECCV 2024posterarXiv:2409.19232
23
citations