NEURIPS 2025 "cross-modal alignment" Papers
10 papers found
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Document Understanding
Ahmed Masry, Juan Rodriguez, Tianyu Zhang et al.
NEURIPS 2025 · poster · arXiv:2502.01341
Amplifying Prominent Representations in Multimodal Learning via Variational Dirichlet Process
Tsai Hor Chan, Feng Wu, Yihang Chen et al.
NEURIPS 2025 · poster · arXiv:2510.20736
Beyond Modality Collapse: Representation Blending for Multimodal Dataset Distillation
Xin Zhang, Ziruo Zhang, Jiawei Du et al.
NEURIPS 2025 · poster · arXiv:2505.14705 · 3 citations
CF-VLM: CounterFactual Vision-Language Fine-tuning
Jusheng Zhang, Kaitong Cai, Yijia Fan et al.
NEURIPS 2025 · poster
Learning Source-Free Domain Adaptation for Visible-Infrared Person Re-Identification
Yongxiang Li, Yanglin Feng, Yuan Sun et al.
NEURIPS 2025 · poster
Robust Cross-modal Alignment Learning for Cross-Scene Spatial Reasoning and Grounding
Yanglin Feng, Hongyuan Zhu, Dezhong Peng et al.
NEURIPS 2025 · poster
Seg4Diff: Unveiling Open-Vocabulary Semantic Segmentation in Text-to-Image Diffusion Transformers
Chaehyun Kim, Heeseong Shin, Eunbeen Hong et al.
NEURIPS 2025 · poster · 6 citations
SGAR: Structural Generative Augmentation for 3D Human Motion Retrieval
Jiahang Zhang, Lilang Lin, Shuai Yang et al.
NEURIPS 2025 · poster
The Indra Representation Hypothesis
Jianglin Lu, Hailing Wang, Kuo Yang et al.
NEURIPS 2025 · poster
When Kernels Multiply, Clusters Unify: Fusing Embeddings with the Kronecker Product
Youqi Wu, Jingwei Zhang, Farzan Farnia
NEURIPS 2025 · poster · arXiv:2506.08645 · 2 citations