Poster "region captioning" Papers
3 papers found
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models
Chuofan Ma, Yi Jiang, Jiannan Wu et al.
ECCV 2024 poster arXiv:2404.13013
107 citations
NExT-Chat: An LMM for Chat, Detection and Segmentation
Ao Zhang, Yuan Yao, Wei Ji et al.
ICML 2024 poster
Tokenize Anything via Prompting
Ting Pan, Lulu Tang, Xinlong Wang et al.
ECCV 2024 poster arXiv:2312.09128
35 citations