"vision transformers" Papers

63 papers found • Page 2 of 2

Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models

Hengyi Wang, Shiwei Tan, Hao Wang

ICML 2024poster

Revealing the Dark Secrets of Extremely Large Kernel ConvNets on Robustness

Honghao Chen, Zhang Yurong, xiaokun Feng et al.

ICML 2024poster

Robustness Tokens: Towards Adversarial Robustness of Transformers

Brian Pulfer, Yury Belousov, Slava Voloshynovskiy

ECCV 2024posterarXiv:2503.10191

Sample-specific Masks for Visual Reprogramming-based Prompting

Chengyi Cai, Zesheng Ye, Lei Feng et al.

ICML 2024spotlight

Sparse Model Inversion: Efficient Inversion of Vision Transformers for Data-Free Applications

Zixuan Hu, Yongxian Wei, Li Shen et al.

ICML 2024poster

Spatial Transform Decoupling for Oriented Object Detection

Hongtian Yu, Yunjie Tian, Qixiang Ye et al.

AAAI 2024paperarXiv:2308.10561

SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization

Xixu Hu, Runkai Zheng, Jindong Wang et al.

ECCV 2024posterarXiv:2402.03317
5
citations

Stitched ViTs are Flexible Vision Backbones

Zizheng Pan, Jing Liu, Haoyu He et al.

ECCV 2024posterarXiv:2307.00154
4
citations

Sub-token ViT Embedding via Stochastic Resonance Transformers

Dong Lao, Yangchao Wu, Tian Yu Liu et al.

ICML 2024poster

TOP-ReID: Multi-Spectral Object Re-identification with Token Permutation

Yuhao Wang, Xuehu Liu, Pingping Zhang et al.

AAAI 2024paperarXiv:2312.09612
45
citations

Vision Transformers as Probabilistic Expansion from Learngene

Qiufeng Wang, Xu Yang, Haokun Chen et al.

ICML 2024poster

ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining

Dezhi Peng, Chongyu Liu, Yuliang Liu et al.

AAAI 2024paperarXiv:2306.12106
18
citations

xT: Nested Tokenization for Larger Context in Large Images

Ritwik Gupta, Shufan Li, Tyler Zhu et al.

ICML 2024poster