2024 Poster "multimodal large language models" Papers

13 papers found

Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast

Xiangming Gu, Xiaosen Zheng, Tianyu Pang et al.

ICML 2024 poster

DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM

Yixuan Wu, Yizhou Wang, Shixiang Tang et al.

ECCV 2024 poster | arXiv:2403.12488 | 47 citations

Grounding Language Models for Visual Entity Recognition

Zilin Xiao, Ming Gong, Paola Cascante-Bonilla et al.

ECCV 2024 poster | arXiv:2402.18695 | 13 citations

Improving Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning

Wei Li, Hehe Fan, Yongkang Wong et al.

ICML 2024 poster

LLMCO4MR: LLMs-aided Neural Combinatorial Optimization for Ancient Manuscript Restoration from Fragments with Case Studies on Dunhuang

Yuqing Zhang, Hangqi Li, Shengyu Zhang et al.

ECCV 2024 poster | 6 citations

LLMGA: Multimodal Large Language Model based Generation Assistant

Bin Xia, Shiyin Wang, Yingfan Tao et al.

ECCV 2024 poster | arXiv:2311.16500 | 25 citations

Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning

Zhuo Huang, Chang Liu, Yinpeng Dong et al.

ICML 2024 poster | arXiv:2312.02546

MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark

Dongping Chen, Ruoxi Chen, Shilin Zhang et al.

ICML 2024 poster

MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models

Xin Liu, Yichen Zhu, Jindong Gu et al.

ECCV 2024 poster | arXiv:2311.17600 | 183 citations

NExT-GPT: Any-to-Any Multimodal LLM

Shengqiong Wu, Hao Fei, Leigang Qu et al.

ICML 2024 poster

REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models

Agneet Chatterjee, Yiran Luo, Tejas Gokhale et al.

ECCV 2024 poster | arXiv:2408.02231 | 10 citations

RoboMP$^2$: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language Models

Qi Lv, Hao Li, Xiang Deng et al.

ICML 2024 poster

The All-Seeing Project V2: Towards General Relation Comprehension of the Open World

Weiyun Wang, Yiming Ren, Haowen Luo et al.

ECCV 2024 poster | arXiv:2402.19474 | 86 citations