CVPR Poster "multimodal alignment" Papers
5 papers found
AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward
Haonan Han, Xiangzuo Wu, Huan Liao et al.
CVPR 2025 poster, arXiv:2411.18654
5 citations
CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset
Xiao Wang, Fuling Wang, Yuehang Li et al.
CVPR 2025 poster, arXiv:2410.00379
16 citations
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
Kai Chen, Yunhao Gou, Runhui Huang et al.
CVPR 2025 poster, arXiv:2409.18042
44 citations
Harnessing Frozen Unimodal Encoders for Flexible Multimodal Alignment
Mayug Maniparambil, Raiymbek Akshulakov, Yasser Abdelaziz Dahou Djilali et al.
CVPR 2025 poster, arXiv:2409.19425
2 citations
Self-Supervised Spatial Correspondence Across Modalities
Ayush Shrivastava, Andrew Owens
CVPR 2025 poster, arXiv:2506.03148
2 citations