"clip model" Papers
12 papers found
Enhancing Compositional Reasoning in CLIP via Reconstruction and Alignment of Text Descriptions
Jihoon Kwon, Kyle Min, Jy-yong Sohn
NeurIPS 2025 poster · arXiv:2510.16540
SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP
Yusuke Hirota, Min-Hung Chen, Chien-Yi Wang et al.
ICLR 2025 poster · arXiv:2408.10202
11 citations
Attention Prompting on Image for Large Vision-Language Models
Runpeng Yu, Weihao Yu, Xinchao Wang
ECCV 2024 poster · arXiv:2409.17143
28 citations
Data-Free Generalized Zero-Shot Learning
Bowen Tang, Jing Zhang, Yan Long et al.
AAAI 2024 paper · arXiv:2401.15657
Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation
Tong Shao, Zhuotao Tian, Hang Zhao et al.
ECCV 2024 poster · arXiv:2407.08268
44 citations
Federated Adaptive Prompt Tuning for Multi-Domain Collaborative Learning
Shangchao Su, Mingzhao Yang, Bin Li et al.
AAAI 2024 paper · arXiv:2211.07864
37 citations
FM-OV3D: Foundation Model-Based Cross-Modal Knowledge Blending for Open-Vocabulary 3D Detection
Dongmei Zhang, Chang Li, Renrui Zhang et al.
AAAI 2024 paper · arXiv:2312.14465
22 citations
LAMM: Label Alignment for Multi-Modal Prompt Learning
Jingsheng Gao, Jiacheng Ruan, Suncheng Xiang et al.
AAAI 2024 paper · arXiv:2312.08212
28 citations
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
Christian Schlarmann, Naman Singh, Francesco Croce et al.
ICML 2024 poster
V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models
Heng Wang, Jianbo Ma, Santiago Pascual et al.
AAAI 2024 paper · arXiv:2308.09300
74 citations
VCP-CLIP: A visual context prompting model for zero-shot anomaly segmentation
Zhen Qu, Xian Tao, Mukesh Prasad et al.
ECCV 2024 poster · arXiv:2407.12276
55 citations
Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models
Jinhao Li, Haopeng Li, Sarah Erfani et al.
ICML 2024 poster