AAAI 2024 "attention mechanism" Papers
25 papers found
AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model
Teng Hu, Jiangning Zhang, Ran Yi et al.
Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification
Jiaer Xia, Lei Tan, Pingyang Dai et al.
Attention Guided CAM: Visual Explanations of Vision Transformer Guided by Self-Attention
Saebom Leem, Hyunseok Seo
BARET: Balanced Attention Based Real Image Editing Driven by Target-Text Inversion
Yuming Qiao, Fanyi Wang, Jingwen Su et al.
Cached Transformers: Improving Transformers with Differentiable Memory Cached
Zhaoyang Zhang, Wenqi Shao, Yixiao Ge et al.
Correlation Matching Transformation Transformers for UHD Image Restoration
Cong Wang, Jinshan Pan, Wei Wang et al.
Dynamic Feature Pruning and Consolidation for Occluded Person Re-identification
YuTeng Ye, Hang Zhou, Jiale Cai et al.
FaceCoresetNet: Differentiable Coresets for Face Set Recognition
Gil Shapira, Yosi Keller
Gated Attention Coding for Training High-Performance and Efficient Spiking Neural Networks
Xuerui Qiu, Rui-Jie Zhu, Yuhong Chou et al.
Graph Context Transformation Learning for Progressive Correspondence Pruning
Junwen Guo, Guobao Xiao, Shiping Wang et al.
GridFormer: Point-Grid Transformer for Surface Reconstruction
Shengtao Li, Ge Gao, Yudong Liu et al.
Heterogeneous Graph Reasoning for Fact Checking over Texts and Tables
Haisong Gong, Weizhi Xu, Shu Wu et al.
Hierarchical Aligned Multimodal Learning for NER on Tweet Posts
Peipei Liu, Hong Li, Yimo Ren et al.
How to Protect Copyright Data in Optimization of Large Language Models?
Timothy Chu, Zhao Song, Chiwun Yang
KG-TREAT: Pre-training for Treatment Effect Estimation by Synergizing Patient Data with Knowledge Graphs
Ruoqi Liu, Lingfei Wu, Ping Zhang
Multi-Architecture Multi-Expert Diffusion Models
Yunsung Lee, Jin-Young Kim, Hyojun Go et al.
S2WAT: Image Style Transfer via Hierarchical Vision Transformer Using Strips Window Attention
Chiyu Zhang, Xiaogang Xu, Lei Wang et al.
ScanERU: Interactive 3D Visual Grounding Based on Embodied Reference Understanding
Ziyang Lu, Yunqiang Pei, Guoqing Wang et al.
Semantic-Aware Data Augmentation for Text-to-Image Synthesis
Zhaorui Tan, Xi Yang, Kaizhu Huang
Semantic Lens: Instance-Centric Semantic Alignment for Video Super-resolution
SeTformer Is What You Need for Vision and Language
Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger et al.
SpikingBERT: Distilling BERT to Train Spiking Language Models Using Implicit Differentiation
Malyaban Bal, Abhronil Sengupta
Towards Diverse Perspective Learning with Selection over Multiple Temporal Poolings
Jihyeon Seong, Jungmin Kim, Jaesik Choi
Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks
Siyu Zou, Jiji Tang, Yiyi Zhou et al.
ViT-Calibrator: Decision Stream Calibration for Vision Transformer
Lin Chen, Zhijie Jia, Lechao Cheng et al.