ICLR "vision-language tasks" Papers
4 papers found
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Yongxin Guo, Zhenglin Cheng, Xiaoying Tang et al.
ICLR 2025 (poster) · arXiv:2405.14297 · 33 citations
Release the Powers of Prompt Tuning: Cross-Modality Prompt Transfer
Ningyuan Zhang, Jie Lu, Keqiuyin Li et al.
ICLR 2025 (poster) · 1 citation
See What You Are Told: Visual Attention Sink in Large Multimodal Models
Seil Kang, Jinyeong Kim, Junhyeok Kim et al.
ICLR 2025 (poster) · arXiv:2503.03321 · 52 citations
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation
Jinheng Xie, Weijia Mao, Zechen Bai et al.
ICLR 2025 (poster) · arXiv:2408.12528 · 455 citations