"multimodal learning" Papers

19 papers found

$\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs

Vlad Sobal, Mark Ibrahim, Randall Balestriero et al.

ICLR 2025posterarXiv:2407.18134
12
citations

Ask and Remember: A Questions-Only Replay Strategy for Continual Visual Question Answering

Imad Eddine MAROUF, Enzo Tartaglione, Stéphane Lathuilière et al.

ICCV 2025posterarXiv:2502.04469
1
citations

Can Text-to-Video Generation help Video-Language Alignment?

Luca Zanella, Massimiliano Mancini, Willi Menapace et al.

CVPR 2025posterarXiv:2503.18507
1
citations

Improving Multimodal Learning via Imbalanced Learning

Shicai Wei, Chunbo Luo, Yang Luo

ICCV 2025posterarXiv:2507.10203
4
citations

Learning Diffusion Models with Flexible Representation Guidance

Chenyu Wang, Cai Zhou, Sharut Gupta et al.

NeurIPS 2025posterarXiv:2507.08980
5
citations

Modality-Aware SAM: Sharpness-Aware-Minimization Driven Gradient Modulation for Harmonized Multimodal Learning

Hossein Rajoli Nowdeh, Jie Ji, Xiaolong Ma et al.

NeurIPS 2025posterarXiv:2510.24919

ModuLM: Enabling Modular and Multimodal Molecular Relational Learning with Large Language Models

Zhuo Chen, YIZHEN ZHENG, Huan Yee Koh et al.

NeurIPS 2025posterarXiv:2506.00880
1
citations

NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics

David Robinson, Marius Miron, Masato Hagiwara et al.

ICLR 2025posterarXiv:2411.07186
23
citations

TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation

Jiaben Chen, Zixin Wang, AILING ZENG et al.

NeurIPS 2025posterarXiv:2510.07249
3
citations

Vision‑Language‑Vision Auto‑Encoder: Scalable Knowledge Distillation from Diffusion Models

Tiezheng Zhang, Yitong Li, Yu-Cheng Chou et al.

NeurIPS 2025posterarXiv:2507.07104
2
citations

CLAP: Isolating Content from Style through Contrastive Learning with Augmented Prompts

Yichao Cai, Yuhang Liu, Zhen Zhang et al.

ECCV 2024posterarXiv:2311.16445
11
citations

Contrasting Multiple Representations with the Multi-Marginal Matching Gap

Zoe Piran, Michal Klein, James Thornton et al.

ICML 2024poster

Enhancing Storage and Computational Efficiency in Federated Multimodal Learning for Large-Scale Models

Zixin Zhang, Fan Qi, Changsheng Xu

ICML 2024poster

Gradient-Guided Modality Decoupling for Missing-Modality Robustness

AAAI 2024paperarXiv:2402.16318

IIANet: An Intra- and Inter-Modality Attention Network for Audio-Visual Speech Separation

Kai Li, Runxuan Yang, Fuchun Sun et al.

ICML 2024oral

MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance

Yake Wei, Di Hu

ICML 2024poster

MULTISCRIPT: Multimodal Script Learning for Supporting Open Domain Everyday Tasks

Jingyuan Qi, Minqian Liu, Ying Shen et al.

AAAI 2024paperarXiv:2310.04965
3
citations

Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective

Yang Chen, Cong Fang, Zhouchen Lin et al.

ICML 2024poster

Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement

che liu, Zhongwei Wan, Cheng Ouyang et al.

ICML 2024poster