2025 Poster "multimodal learning" Papers

13 papers found

$\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs

Vlad Sobal, Mark Ibrahim, Randall Balestriero et al.

ICLR 2025 poster arXiv:2407.18134
12 citations

Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition

Chengxiang Huang, Yake Wei, Zequn Yang et al.

CVPR 2025 poster arXiv:2503.18595
8 citations

Ask and Remember: A Questions-Only Replay Strategy for Continual Visual Question Answering

Imad Eddine Marouf, Enzo Tartaglione, Stéphane Lathuilière et al.

ICCV 2025 poster arXiv:2502.04469
1 citation

Beyond Modality Collapse: Representation Blending for Multimodal Dataset Distillation

Xin Zhang, Ziruo Zhang, Jiawei Du et al.

NeurIPS 2025 poster arXiv:2505.14705
3 citations

Can Text-to-Video Generation help Video-Language Alignment?

Luca Zanella, Massimiliano Mancini, Willi Menapace et al.

CVPR 2025 poster arXiv:2503.18507
1 citation

GeoMM: On Geodesic Perspective for Multi-modal Learning

Shibin Mei, Hang Wang, Bingbing Ni

CVPR 2025 poster arXiv:2505.11216

Improving Multimodal Learning via Imbalanced Learning

Shicai Wei, Chunbo Luo, Yang Luo

ICCV 2025 poster arXiv:2507.10203
4 citations

Learning Diffusion Models with Flexible Representation Guidance

Chenyu Wang, Cai Zhou, Sharut Gupta et al.

NeurIPS 2025 poster arXiv:2507.08980
5 citations

Modality-Aware SAM: Sharpness-Aware-Minimization Driven Gradient Modulation for Harmonized Multimodal Learning

Hossein Rajoli Nowdeh, Jie Ji, Xiaolong Ma et al.

NeurIPS 2025 poster arXiv:2510.24919

ModuLM: Enabling Modular and Multimodal Molecular Relational Learning with Large Language Models

Zhuo Chen, Yizhen Zheng, Huan Yee Koh et al.

NeurIPS 2025 poster arXiv:2506.00880
1 citation

NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics

David Robinson, Marius Miron, Masato Hagiwara et al.

ICLR 2025 poster arXiv:2411.07186
23 citations

TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation

Jiaben Chen, Zixin Wang, Ailing Zeng et al.

NeurIPS 2025 poster arXiv:2510.07249
3 citations

Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models

Tiezheng Zhang, Yitong Li, Yu-Cheng Chou et al.

NeurIPS 2025 poster arXiv:2507.07104
2 citations