2024 Poster Papers matching "vision transformers"
28 papers found
Adapting Pretrained ViTs with Convolution Injector for Visuo-Motor Control
Dongyoon Hwang, Byungkun Lee, Hojoon Lee et al.
Agent Attention: On the Integration of Softmax and Linear Attention
Dongchen Han, Tianzhu Ye, Yizeng Han et al.
AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers
Reduan Achtibat, Sayed Mohammad Vakilzadeh Hatefi, Maximilian Dreyer et al.
AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors
Kaishen Yuan, Zitong Yu, Xin Liu et al.
Characterizing Model Robustness via Natural Input Gradients
Adrian Rodriguez-Munoz, Tongzhou Wang, Antonio Torralba
Converting Transformers to Polynomial Form for Secure Inference Over Homomorphic Encryption
Itamar Zimerman, Moran Baruch, Nir Drucker et al.
Decoupling Feature Extraction and Classification Layers for Calibrated Neural Networks
Mikkel Jordahn, Pablo Olmos
Denoising Vision Transformers
Jiawei Yang, Katie Luo, Jiefeng Li et al.
Fine-grained Local Sensitivity Analysis of Standard Dot-Product Self-Attention
Aaron Havens, Alexandre Araujo, Huan Zhang et al.
GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features
Luc Sträter, Mohammadreza Salehi, Efstratios Gavves et al.
Grid-Attention: Enhancing Computational Efficiency of Large Vision Models without Fine-Tuning
Pengyu Li, Biao Wang, Tianchu Guo et al.
KernelWarehouse: Rethinking the Design of Dynamic Convolution
Chao Li, Anbang Yao
Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning
Jihai Zhang, Xiang Lan, Xiaoye Qu et al.
LookupViT: Compressing visual information to a limited number of tokens
Rajat Koner, Gagan Jain, Sujoy Paul et al.
Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression
Dingyuan Zhang, Dingkang Liang, Zichang Tan et al.
Mobile Attention: Mobile-Friendly Linear-Attention for Vision Transformers
Zhiyu Yao, Jian Wang, Haixu Wu et al.
PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for Faster Inference
Tanvir Mahmud, Burhaneddin Yaman, Chun-Hao Liu et al.
PDiscoFormer: Relaxing Part Discovery Constraints with Vision Transformers
Ananthu Aniraj, Cassio F. Dantas, Dino Ienco et al.
Phase Concentration and Shortcut Suppression for Weakly Supervised Semantic Segmentation
Hoyong Kwon, Jaeseok Jeong, Sung-Hoon Yoon et al.
Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models
Hengyi Wang, Shiwei Tan, Hao Wang
Revealing the Dark Secrets of Extremely Large Kernel ConvNets on Robustness
Honghao Chen, Yurong Zhang, Xiaokun Feng et al.
Robustness Tokens: Towards Adversarial Robustness of Transformers
Brian Pulfer, Yury Belousov, Slava Voloshynovskiy
Sparse Model Inversion: Efficient Inversion of Vision Transformers for Data-Free Applications
Zixuan Hu, Yongxian Wei, Li Shen et al.
SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization
Xixu Hu, Runkai Zheng, Jindong Wang et al.
Stitched ViTs are Flexible Vision Backbones
Zizheng Pan, Jing Liu, Haoyu He et al.
Sub-token ViT Embedding via Stochastic Resonance Transformers
Dong Lao, Yangchao Wu, Tian Yu Liu et al.
Vision Transformers as Probabilistic Expansion from Learngene
Qiufeng Wang, Xu Yang, Haokun Chen et al.
xT: Nested Tokenization for Larger Context in Large Images
Ritwik Gupta, Shufan Li, Tyler Zhu et al.