"zero-shot classification" Papers
16 papers found
Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion
Marco Mistretta, Alberto Baldrati, Lorenzo Agnolucci et al.
ICLR 2025, poster, arXiv:2502.04263
15 citations
CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features
Po-han Li, Sandeep Chinchali, Ufuk Topcu
ICLR 2025, poster, arXiv:2410.07610
5 citations
Harnessing Frozen Unimodal Encoders for Flexible Multimodal Alignment
Mayug Maniparambil, Raiymbek Akshulakov, Yasser Abdelaziz Dahou Djilali et al.
CVPR 2025, poster, arXiv:2409.19425
2 citations
MobileViCLIP: An Efficient Video-Text Model for Mobile Devices
Min Yang, Zihan Jia, Zhilin Dai et al.
ICCV 2025, poster, arXiv:2508.07312
NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics
David Robinson, Marius Miron, Masato Hagiwara et al.
ICLR 2025, poster, arXiv:2411.07186
23 citations
Semi-Supervised CLIP Adaptation by Enforcing Semantic and Trapezoidal Consistency
Kai Gan, Bo Ye, Min-Ling Zhang et al.
ICLR 2025, poster
3 citations
VL-SAE: Interpreting and Enhancing Vision-Language Alignment with a Unified Concept Set
Shufan Shen, Junshu Sun, Qingming Huang et al.
NeurIPS 2025, poster, arXiv:2510.21323
1 citation
Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks
Wenhan Yang, Jingdong Gao, Baharan Mirzasoleiman
ICML 2024, poster
CLAP: Isolating Content from Style through Contrastive Learning with Augmented Prompts
Yichao Cai, Yuhang Liu, Zhen Zhang et al.
ECCV 2024, poster, arXiv:2311.16445
11 citations
Modeling Caption Diversity in Contrastive Vision-Language Pretraining
Samuel Lavoie, Polina Kirichenko, Mark Ibrahim et al.
ICML 2024, poster
Multi-Label Cluster Discrimination for Visual Representation Learning
Xiang An, Kaicheng Yang, Xiangzi Dai et al.
ECCV 2024, poster, arXiv:2407.17331
12 citations
Online Zero-Shot Classification with CLIP
Qi Qian, Juhua Hu
ECCV 2024, poster, arXiv:2408.13320
21 citations
OT-CLIP: Understanding and Generalizing CLIP via Optimal Transport
Liangliang Shi, Jack Fan, Junchi Yan
ICML 2024, poster
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
Christian Schlarmann, Naman Singh, Francesco Croce et al.
ICML 2024, poster
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment
Ziping Ma, Furong Xu, Jian Liu et al.
ICML 2024, poster
Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement
Che Liu, Zhongwei Wan, Cheng Ouyang et al.
ICML 2024, poster