2025 "image understanding" Papers
8 papers found
Auto-Vocabulary Semantic Segmentation
Osman Ülger, Maksymilian Kulicki, Yuki Asano et al.
ICCV 2025posterarXiv:2312.04539
4
citations
Classifier-guided CLIP Distillation for Unsupervised Multi-label Classification
Dongseob Kim, Hyunjung Shim
CVPR 2025posterarXiv:2503.16873
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
Jiahui Gao, Renjie Pi, Jipeng Zhang et al.
ICLR 2025posterarXiv:2312.11370
170
citations
LMFusion: Adapting Pretrained Language Models for Multimodal Generation
Weijia Shi, Xiaochuang Han, Chunting Zhou et al.
NEURIPS 2025posterarXiv:2412.15188
79
citations
SAM-R1: Leveraging SAM for Reward Feedback in Multimodal Segmentation via Reinforcement Learning
Jiaqi Huang, Zunnan Xu, Jun Zhou et al.
NEURIPS 2025posterarXiv:2505.22596
8
citations
SparseVILA: Decoupling Visual Sparsity for Efficient VLM Inference
Samir Khaki, Junxian Guo, Jiaming Tang et al.
ICCV 2025posterarXiv:2510.17777
1
citations
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding
Hao Li, Changyao TIAN, Jie Shao et al.
CVPR 2025posterarXiv:2412.09604
35
citations
Vinci: Deep Thinking in Text-to-Image Generation using Unified Model with Reinforcement Learning
wang lin, Wentao Hu, Liyu Jia et al.
NEURIPS 2025poster