Poster "multi-modal learning" Papers
16 papers found
DIFFER: Disentangling Identity Features via Semantic Cues for Clothes-Changing Person Re-ID
Xin Liang, Yogesh S. Rawat
CVPR 2025 · poster · arXiv:2503.22912
9 citations
Hierarchical Self-Attention: Generalizing Neural Attention Mechanics to Multi-Scale Problems
Saeed Amizadeh, Sara Abdali, Yinheng Li et al.
NeurIPS 2025 · poster · arXiv:2509.15448
Hierarchical Semantic-Augmented Navigation: Optimal Transport and Graph-Driven Reasoning for Vision-Language Navigation
Xiang Fang, Wanlong Fang, Changshuo Wang
NeurIPS 2025 · poster
Incomplete Multi-view Deep Clustering with Data Imputation and Alignment
Jiyuan Liu, Xinwang Liu, Xinhang Wan et al.
NeurIPS 2025 · poster
8 citations
Learning Diagrams: A Graphical Language for Compositional Training Regimes
Mason Lary, Richard Samuelson, Alexander Wilentz et al.
ICLR 2025 · poster
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
Ziyu Liu, Yuhang Zang, Xiaoyi Dong et al.
ICLR 2025 · poster · arXiv:2410.17637
19 citations
Multi-modal Knowledge Distillation-based Human Trajectory Forecasting
Jaewoo Jeong, Seohee Lee, Daehee Park et al.
CVPR 2025 · poster · arXiv:2503.22201
6 citations
Multi-modal Learning: A Look Back and the Road Ahead
Divyam Madaan, Sumit Chopra, Kyunghyun Cho
ICLR 2025 · poster
Rethinking Vision-Language Model in Face Forensics: Multi-Modal Interpretable Forged Face Detector
Xiao Guo, Xiufeng Song, Yue Zhang et al.
CVPR 2025 · poster · arXiv:2503.20188
24 citations
SkySense V2: A Unified Foundation Model for Multi-modal Remote Sensing
Yingying Zhang, Lixiang Ru, Kang Wu et al.
ICCV 2025 · poster · arXiv:2507.13812
7 citations
SyncVP: Joint Diffusion for Synchronous Multi-Modal Video Prediction
Enrico Pallotta, Sina Mokhtarzadeh Azar, Shuai Li et al.
CVPR 2025 · poster · arXiv:2503.18933
Understanding Contrastive Learning via Gaussian Mixture Models
Parikshit Bansal, Ali Kavis, Sujay Sanghavi
NeurIPS 2025 · poster
3 citations
DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment
Jiuming Liu, Dong Zhuo, Zhiheng Feng et al.
ECCV 2024 · poster · arXiv:2403.18274
36 citations
ReconBoost: Boosting Can Achieve Modality Reconcilement
Cong Hua, Qianqian Xu, Shilong Bao et al.
ICML 2024 · poster · arXiv:2405.09321
SkyScenes: A Synthetic Dataset for Aerial Scene Understanding
Sahil Santosh Khose, Anisha Pal, Aayushi Agarwal et al.
ECCV 2024 · poster · arXiv:2312.06719
7 citations
Transferring Knowledge From Large Foundation Models to Small Downstream Models
Shikai Qiu, Boran Han, Danielle Robinson et al.
ICML 2024 · poster · arXiv:2406.07337