ICLR 2025 "multimodal models" Papers
8 papers found

Context-aware Dynamic Pruning for Speech Foundation Models
Masao Someki, Yifan Peng, Siddhant Arora et al.
ICLR 2025 poster · 7 citations

Correlating instruction-tuning (in multimodal models) with vision-language processing (in the brain)
Subba Reddy Oota, Akshett Rai Jindal, Ishani Mondal et al.
ICLR 2025 poster · arXiv:2505.20029 · 4 citations

Diff-Prompt: Diffusion-driven Prompt Generator with Mask Supervision
Weicai Yan, Wang Lin, Zirun Guo et al.
ICLR 2025 poster · arXiv:2504.21423 · 6 citations

ElasticTok: Adaptive Tokenization for Image and Video
Wilson Yan, Volodymyr Mnih, Aleksandra Faust et al.
ICLR 2025 poster · arXiv:2410.08368 · 21 citations

Matryoshka Multimodal Models
Mu Cai, Jianwei Yang, Jianfeng Gao et al.
ICLR 2025 poster · arXiv:2405.17430 · 58 citations

MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge
Yuntao Du, Kailin Jiang, Zhi Gao et al.
ICLR 2025 poster · arXiv:2502.19870 · 9 citations

Reconstructive Visual Instruction Tuning
Haochen Wang, Anlin Zheng, Yucheng Zhao et al.
ICLR 2025 poster · arXiv:2410.09575 · 34 citations

See What You Are Told: Visual Attention Sink in Large Multimodal Models
Seil Kang, Jinyeong Kim, Junhyeok Kim et al.
ICLR 2025 poster · arXiv:2503.03321 · 52 citations