ICLR 2025 "cross-modal alignment" Papers
6 papers found
CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs
Jinlan Fu, Shenzhen Huangfu, Hao Fei et al.
ICLR 2025posterarXiv:2501.16629
19
citations
DenseGrounding: Improving Dense Language-Vision Semantics for Ego-centric 3D Visual Grounding
Henry Zheng, Hao Shi, Qihang Peng et al.
ICLR 2025posterarXiv:2505.04965
8
citations
Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective
Xiangru Zhu, Penglei Sun, Yaoxian Song et al.
ICLR 2025posterarXiv:2410.10291
2
citations
Learning Fine-Grained Representations through Textual Token Disentanglement in Composed Video Retrieval
Yue Wu, Zhaobo Qi, Yiling Wu et al.
ICLR 2025poster
7
citations
Mitigate the Gap: Improving Cross-Modal Alignment in CLIP
Sedigheh Eslami, Gerard de Melo
ICLR 2025poster
14
citations
Semi-Supervised CLIP Adaptation by Enforcing Semantic and Trapezoidal Consistency
Kai Gan, Bo Ye, Min-Ling Zhang et al.
ICLR 2025poster
3
citations