AAAI Paper "vision-language models" Papers
18 papers found
Hierarchical Cross-Modal Alignment for Open-Vocabulary 3D Object Detection
Youjun Zhao, Jiaying Lin, Rynson W. H. Lau
Language Prompt for Autonomous Driving
Dongming Wu, Wencheng Han, Yingfei Liu et al.
Learning to Prompt with Text Only Supervision for Vision-Language Models
Muhammad Uzair Khattak, Muhammad Ferjad Naeem, Muzammal Naseer et al.
Mind the Uncertainty in Human Disagreement: Evaluating Discrepancies Between Model Predictions and Human Responses in VQA
Jian Lan, Diego Frassinelli, Barbara Plank
Standing on the Shoulders of Giants: Reprogramming Visual-Language Model for General Deepfake Detection
Kaiqing Lin, Yuzhen Lin, Weixiang Li et al.
AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models
Zhaopeng Gu, Bingke Zhu, Guibo Zhu et al.
CLIM: Contrastive Language-Image Mosaic for Region Representation
Size Wu, Wenwei Zhang, Lumin XU et al.
COMMA: Co-articulated Multi-Modal Learning
Authors: Lianyu Hu, Liqing Gao, Zekang Liu et al.
Compound Text-Guided Prompt Tuning via Image-Adaptive Cues
Hao Tan, Jun Li, Yizhuang Zhou et al.
Delving into Multimodal Prompting for Fine-Grained Visual Classification
Xin Jiang, Hao Tang, Junyao Gao et al.
Domain-Controlled Prompt Learning
Qinglong Cao, Zhengqin Xu, Yuntian Chen et al.
Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models
Yubin Wang, Xinyang Jiang, De Cheng et al.
Multi-Prompts Learning with Cross-Modal Alignment for Attribute-Based Person Re-identification
Yajing Zhai, Yawen Zeng, Zhiyong Huang et al.
p-Laplacian Adaptation for Generative Pre-trained Vision-Language Models
Haoyuan Wu, Xinyun Zhang, Peng Xu et al.
Semantic-Aware Data Augmentation for Text-to-Image Synthesis
Zhaorui Tan, Xi Yang, Kaizhu Huang
Simple Image-Level Classification Improves Open-Vocabulary Object Detection
Ruohuan Fang, Guansong Pang, Xiao Bai
VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection
Peng Wu, Xuerong Zhou, Guansong Pang et al.
Weak Distribution Detectors Lead to Stronger Generalizability of Vision-Language Prompt Tuning
Kun Ding, Haojian Zhang, Qiang Yu et al.