Poster "cross-attention mechanism" Papers

18 papers found

A Conditional Probability Framework for Compositional Zero-shot Learning

Peng Wu, Qiuxia Lai, Hao Fang et al.

ICCV 2025, poster, arXiv:2507.17377
1 citation

Causal Disentanglement and Cross-Modal Alignment for Enhanced Few-Shot Learning

Tianjiao Jiang, Zhen Zhang, Yuhang Liu et al.

ICCV 2025, poster, arXiv:2508.03102
1 citation

CrossMPT: Cross-attention Message-passing Transformer for Error Correcting Codes

Seong-Joon Park, Hee-Youl Kwak, Sang-Hyo Kim et al.

ICLR 2025, poster, arXiv:2405.01033
18 citations

DiffE2E: Rethinking End-to-End Driving with a Hybrid Diffusion-Regression-Classification Policy

Rui Zhao, Yuze Fan, Ziguo Chen et al.

NeurIPS 2025, poster

InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences

Chenyang Zhu, Kai Li, Yue Ma et al.

ICLR 2025, poster, arXiv:2412.01197
29 citations

LLaFEA: Frame-Event Complementary Fusion for Fine-Grained Spatiotemporal Understanding in LMMs

Hanyu Zhou, Gim Hee Lee

ICCV 2025, poster, arXiv:2503.06934
2 citations

LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs

Haoran Lou, Chunxiao Fan, Ziyan Liu et al.

ICCV 2025, poster, arXiv:2507.00505

RadZero: Similarity-Based Cross-Attention for Explainable Vision-Language Alignment in Chest X-ray with Zero-Shot Multi-Task Capability

Jonggwon Park, Byungmu Yoon, Soobum Kim et al.

NeurIPS 2025, poster, arXiv:2504.07416
1 citation

Towards Single-Source Domain Generalized Object Detection via Causal Visual Prompts

Chen Li, Huiying Xu, Changxin Gao et al.

NeurIPS 2025, poster, arXiv:2510.19487

Translation of Text Embedding via Delta Vector to Suppress Strongly Entangled Content in Text-to-Image Diffusion Models

Eunseo Koh, SeungHoo Hong, Tae-Young Kim et al.

ICCV 2025, poster, arXiv:2508.10407

UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation

Lunhao Duan, Shanshan Zhao, Wenjun Yan et al.

CVPR 2025, poster, arXiv:2412.18928
7 citations

X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention

XiaoChen Zhao, Hongyi Xu, Guoxian Song et al.

ICLR 2025, poster, arXiv:2507.23143
17 citations

Image Fusion via Vision-Language Model

Zixiang Zhao, Lilun Deng, Haowen Bai et al.

ICML 2024, poster, arXiv:2402.02235

Meta Evidential Transformer for Few-Shot Open-Set Recognition

Hitesh Sapkota, Krishna Neupane, Qi Yu

ICML 2024, poster

NVS-Adapter: Plug-and-Play Novel View Synthesis from a Single Image

Yoonwoo Jeong, Jinwoo Lee, Chiheon Kim et al.

ECCV 2024, poster, arXiv:2312.07315
9 citations

Point-supervised Panoptic Segmentation via Estimating Pseudo Labels from Learnable Distance

Jing Li, Junsong Fan, Zhaoxiang Zhang

ECCV 2024, poster
2 citations

SemReg: Semantics Constrained Point Cloud Registration

Sheldon Fung, Xuequan Lu, Dasith de Silva Edirimuni et al.

ECCV 2024, poster
7 citations

Text-Anchored Score Composition: Tackling Condition Misalignment in Text-to-Image Diffusion Models

Luozhou Wang, Guibao Shen, Wenhang Ge et al.

ECCV 2024, poster, arXiv:2306.14408
5 citations