"cross-attention mechanisms" Papers
4 papers
Commonsense for Zero-Shot Natural Language Video Localization
Meghana Holla, Ismini Lourentzou
AAAI 2024 · paper · arXiv:2312.17429
5 citations
Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models
Ruichen Wang, Zekang Chen, Chen Chen et al.
AAAI 2024 · paper · arXiv:2305.13921
92 citations
Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model
Danni Yang, Ruohan Dong, Jiayi Ji et al.
ECCV 2024 · poster · arXiv:2407.05352
9 citations
Revealing Vision-Language Integration in the Brain with Multimodal Networks
Vighnesh Subramaniam, Colin Conwell, Christopher Wang et al.
ICML 2024 · poster