"attention mechanism" Papers

293 papers found • Page 6 of 6

PIDformer: Transformer Meets Control Theory

Tam Nguyen, Cesar Uribe, Tan Nguyen et al.

ICML 2024 · poster · arXiv:2402.15989

PinNet: Pinpoint Instructive Information for Retrieval Augmented Code-to-Text Generation

Han Fu, Jian Tan, Pinhan Zhang et al.

ICML 2024 · poster

PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer

Tongkun Guan, Chengyu Lin, Wei Shen et al.

ECCV 2024 · poster · arXiv:2407.07764
16 citations

Positional Knowledge is All You Need: Position-induced Transformer (PiT) for Operator Learning

Junfeng Chen, Kailiang Wu

ICML 2024 · poster · arXiv:2405.09285

Prompting a Pretrained Transformer Can Be a Universal Approximator

Aleksandar Petrov, Phil Torr, Adel Bibi

ICML 2024 · poster · arXiv:2402.14753

Prospector Heads: Generalized Feature Attribution for Large Models & Data

Gautam Machiraju, Alexander Derry, Arjun Desai et al.

ICML 2024 · poster · arXiv:2402.11729

Relation DETR: Exploring Explicit Position Relation Prior for Object Detection

Xiuquan Hou, Meiqin Liu, Senlin Zhang et al.

ECCV 2024 · poster · arXiv:2407.11699
61 citations

Relaxing the Accurate Imputation Assumption in Doubly Robust Learning for Debiased Collaborative Filtering

Haoxuan Li, Chunyuan Zheng, Shuyi Wang et al.

ICML 2024 · spotlight

Repeat After Me: Transformers are Better than State Space Models at Copying

Samy Jelassi, David Brandfonbrener, Sham Kakade et al.

ICML 2024 · poster · arXiv:2402.01032

RPBG: Towards Robust Neural Point-based Graphics in the Wild

Qingtian Zhu, Zizhuang Wei, Zhongtian Zheng et al.

ECCV 2024 · poster · arXiv:2405.05663
5 citations

S2WAT: Image Style Transfer via Hierarchical Vision Transformer Using Strips Window Attention

Chiyu Zhang, Xiaogang Xu, Lei Wang et al.

AAAI 2024 · paper · arXiv:2210.12381
46 citations

SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention

Romain Ilbert, Ambroise Odonnat, Vasilii Feofanov et al.

ICML 2024 · poster · arXiv:2402.10198

ScanERU: Interactive 3D Visual Grounding Based on Embodied Reference Understanding

Ziyang Lu, Yunqiang Pei, Guoqing Wang et al.

AAAI 2024 · paper · arXiv:2303.13186

Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection

Tim Salzmann, Markus Ryll, Alex Bewley et al.

ECCV 2024 · poster · arXiv:2403.14270
8 citations

Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes

Yingyi Chen, Qinghua Tao, Francesco Tonin et al.

ICML 2024 · poster · arXiv:2402.01476

Semantic-Aware Data Augmentation for Text-to-Image Synthesis

Zhaorui Tan, Xi Yang, Kaizhu Huang

AAAI 2024 · paper · arXiv:2312.07951
4 citations

Semantic Lens: Instance-Centric Semantic Alignment for Video Super-resolution

AAAI 2024 · paper · arXiv:2312.07823

SeTformer Is What You Need for Vision and Language

Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger et al.

AAAI 2024 · paper · arXiv:2401.03540
7 citations

SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds

Yanbo Wang, Wentao Zhao, Cao Chuan et al.

ECCV 2024 · poster · arXiv:2407.11569
17 citations

Simple linear attention language models balance the recall-throughput tradeoff

Simran Arora, Sabri Eyuboglu, Michael Zhang et al.

ICML 2024 · spotlight · arXiv:2402.18668

SparQ Attention: Bandwidth-Efficient LLM Inference

Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley et al.

ICML 2024 · poster · arXiv:2312.04985

Sparse and Structured Hopfield Networks

Saúl Santos, Vlad Niculae, Daniel McNamee et al.

ICML 2024 · spotlight · arXiv:2402.13725

SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization

Xixu Hu, Runkai Zheng, Jindong Wang et al.

ECCV 2024 · poster · arXiv:2402.03317
5 citations

SpikingBERT: Distilling BERT to Train Spiking Language Models Using Implicit Differentiation

Malyaban Bal, Abhronil Sengupta

AAAI 2024 · paper · arXiv:2308.10873
70 citations

StableMask: Refining Causal Masking in Decoder-only Transformer

Qingyu Yin, Xuzheng He, Xiang Zhuang et al.

ICML 2024 · poster · arXiv:2402.04779

Statistical Test for Attention Maps in Vision Transformers

Tomohiro Shiraishi, Daiki Miwa, Teruyuki Katsuoka et al.

ICML 2024 · poster

Stripe Observation Guided Inference Cost-free Attention Mechanism

Zhongzhan Huang, Shanshan Zhong, Wushao Wen et al.

ECCV 2024 · poster
1 citation

Subgraphormer: Unifying Subgraph GNNs and Graph Transformers via Graph Products

Guy Bar Shalom, Beatrice Bevilacqua, Haggai Maron

ICML 2024 · poster · arXiv:2402.08450

Tandem Transformers for Inference Efficient LLMs

Aishwarya P S, Pranav Nair, Yashas Samaga et al.

ICML 2024 · poster · arXiv:2402.08644

TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling

Dong Huo, Zixin Guo, Xinxin Zuo et al.

ECCV 2024 · poster · arXiv:2408.01291
19 citations

Towards Diverse Perspective Learning with Selection over Multiple Temporal Poolings

Jihyeon Seong, Jungmin Kim, Jaesik Choi

AAAI 2024 · paper · arXiv:2403.09749
1 citation

Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks

Siyu Zou, Jiji Tang, Yiyi Zhou et al.

AAAI 2024 · paper · arXiv:2401.07709
19 citations

Towards Efficient Spiking Transformer: a Token Sparsification Framework for Training and Inference Acceleration

Zhengyang Zhuge, Peisong Wang, Xingting Yao et al.

ICML 2024 · poster

Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features

Simone Bombari, Marco Mondelli

ICML 2024 · poster · arXiv:2402.02969

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

Tri Dao, Albert Gu

ICML 2024 · poster · arXiv:2405.21060

Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape

Juno Kim, Taiji Suzuki

ICML 2024 · poster · arXiv:2402.01258

Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration

Zhongzhi Yu, Zheng Wang, Yonggan Fu et al.

ICML 2024 · poster · arXiv:2406.15765

Vectorized Conditional Neural Fields: A Framework for Solving Time-dependent Parametric Partial Differential Equations

Jan Hagnberger, Marimuthu Kalimuthu, Daniel Musekamp et al.

ICML 2024 · oral · arXiv:2406.03919

Viewing Transformers Through the Lens of Long Convolutions Layers

Itamar Zimerman, Lior Wolf

ICML 2024 · poster

Visual Transformer with Differentiable Channel Selection: An Information Bottleneck Inspired Approach

Yancheng Wang, Ping Li, Yingzhen Yang

ICML 2024 · poster

ViT-Calibrator: Decision Stream Calibration for Vision Transformer

Lin Chen, Zhijie Jia, Lechao Cheng et al.

AAAI 2024 · paper · arXiv:2304.04354
3 citations

Wavelength-Embedding-guided Filter-Array Transformer for Spectral Demosaicing

Haijin Zeng, Hiep Luong, Wilfried Philips

ECCV 2024 · poster
1 citation

What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks

Xingwu Chen, Difan Zou

ICML 2024 · poster · arXiv:2404.01601