CVPR "contrastive learning" Papers
22 papers found
Active Data Curation Effectively Distills Large-Scale Multimodal Models
Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem et al.
Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement
Qiyuan Dai, Hanzhuo Huang, Yu Wu et al.
AniMer: Animal Pose and Shape Estimation Using Family Aware Transformer
Jin Lyu, Tianyi Zhu, Yi Gu et al.
Bringing CLIP to the Clinic: Dynamic Soft Labels and Negation-Aware Learning for Medical Analysis
Hanbin Ko, Chang Min Park
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment
Edson Araujo, Andrew Rouditchenko, Yuan Gong et al.
Context-Aware Multimodal Pretraining
Karsten Roth, Zeynep Akata, Dima Damen et al.
CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset
Xiao Wang, Fuling Wang, Yuehang Li et al.
DeDe: Detecting Backdoor Samples for SSL Encoders via Decoders
Sizai Hou, Songze Li, Duanyi Yao
DyCON: Dynamic Uncertainty-aware Consistency and Contrastive Learning for Semi-supervised Medical Image Segmentation
Maregu Assefa, Muzammal Naseer, IYYAKUTTI IYAPPAN GANAPATHI et al.
EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing
Gaoxiang Cong, Jiadong Pan, Liang Li et al.
Enhanced Contrastive Learning with Multi-view Longitudinal Data for Chest X-ray Report Generation
Kang Liu, Zhuoqi Ma, Xiaolu Kang et al.
Gaussian Splatting Feature Fields for (Privacy-Preserving) Visual Localization
Maxime Pietrantoni, Gabriela Csurka, Torsten Sattler
Harnessing Frozen Unimodal Encoders for Flexible Multimodal Alignment
Mayug Maniparambil, Raiymbek Akshulakov, YASSER ABDELAZIZ DAHOU DJILALI et al.
Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning
Sherry X. Chen, Misha Sra, Pradeep Sen
Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval
Boseung Jeong, Jicheol Park, Sungyeon Kim et al.
Learning Visual Composition through Improved Semantic Guidance
Austin Stone, Hagen Soltau, Robert Geirhos et al.
MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation
Shuwei Shi, Biao Gong, Xi Chen et al.
Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting
Runsong Zhu, Shi Qiu, ZHENGZHE LIU et al.
SmartCLIP: Modular Vision-language Alignment with Identification Guarantees
Shaoan Xie, Lingjing Kong, Yujia Zheng et al.
Towards Natural Language-Based Document Image Retrieval: New Dataset and Benchmark
Hao Guo, Xugong Qin, Jun Jie Ou Yang et al.
VladVA: Discriminative Fine-tuning of LVLMs
Yassine Ouali, Adrian Bulat, ALEXANDROS XENOS et al.
WeakMCN: Multi-task Collaborative Network for Weakly Supervised Referring Expression Comprehension and Segmentation
Silin Cheng, Yang Liu, Xinwei He et al.