"attention mechanism" Papers

293 papers found • Page 6 of 6

PIDformer: Transformer Meets Control Theory

Tam Nguyen, Cesar Uribe, Tan Nguyen et al.

ICML 2024 · poster · arXiv:2402.15989

PinNet: Pinpoint Instructive Information for Retrieval Augmented Code-to-Text Generation

Han Fu, Jian Tan, Pinhan Zhang et al.

ICML 2024 · poster

PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer

Tongkun Guan, Chengyu Lin, Wei Shen et al.

ECCV 2024 · poster · arXiv:2407.07764
16 citations

Positional Knowledge is All You Need: Position-induced Transformer (PiT) for Operator Learning

Junfeng Chen, Kailiang Wu

ICML 2024 · poster · arXiv:2405.09285

Prompting a Pretrained Transformer Can Be a Universal Approximator

Aleksandar Petrov, Phil Torr, Adel Bibi

ICML 2024 · poster · arXiv:2402.14753

Prospector Heads: Generalized Feature Attribution for Large Models & Data

Gautam Machiraju, Alexander Derry, Arjun Desai et al.

ICML 2024 · poster · arXiv:2402.11729

Relation DETR: Exploring Explicit Position Relation Prior for Object Detection

Xiuquan Hou, Meiqin Liu, Senlin Zhang et al.

ECCV 2024 · poster · arXiv:2407.11699
61 citations

Relaxing the Accurate Imputation Assumption in Doubly Robust Learning for Debiased Collaborative Filtering

Haoxuan Li, Chunyuan Zheng, Shuyi Wang et al.

ICML 2024 · spotlight

Repeat After Me: Transformers are Better than State Space Models at Copying

Samy Jelassi, David Brandfonbrener, Sham Kakade et al.

ICML 2024 · poster · arXiv:2402.01032

RPBG: Towards Robust Neural Point-based Graphics in the Wild

Qingtian Zhu, Zizhuang Wei, Zhongtian Zheng et al.

ECCV 2024 · poster · arXiv:2405.05663
5 citations

S2WAT: Image Style Transfer via Hierarchical Vision Transformer Using Strips Window Attention

Chiyu Zhang, Xiaogang Xu, Lei Wang et al.

AAAI 2024 · paper · arXiv:2210.12381
46 citations

SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention

Romain Ilbert, Ambroise Odonnat, Vasilii Feofanov et al.

ICML 2024 · poster · arXiv:2402.10198

ScanERU: Interactive 3D Visual Grounding Based on Embodied Reference Understanding

Ziyang Lu, Yunqiang Pei, Guoqing Wang et al.

AAAI 2024 · paper · arXiv:2303.13186

Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection

Tim Salzmann, Markus Ryll, Alex Bewley et al.

ECCV 2024 · poster · arXiv:2403.14270
8 citations

Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes

Yingyi Chen, Qinghua Tao, Francesco Tonin et al.

ICML 2024 · poster · arXiv:2402.01476

Semantic-Aware Data Augmentation for Text-to-Image Synthesis

Zhaorui Tan, Xi Yang, Kaizhu Huang

AAAI 2024 · paper · arXiv:2312.07951
4 citations

Semantic Lens: Instance-Centric Semantic Alignment for Video Super-resolution

AAAI 2024 · paper · arXiv:2312.07823

SeTformer Is What You Need for Vision and Language

Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger et al.

AAAI 2024 · paper · arXiv:2401.03540
7 citations

SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds

Yanbo Wang, Wentao Zhao, Cao Chuan et al.

ECCV 2024 · poster · arXiv:2407.11569
17 citations

Simple linear attention language models balance the recall-throughput tradeoff

Simran Arora, Sabri Eyuboglu, Michael Zhang et al.

ICML 2024 · spotlight · arXiv:2402.18668

SparQ Attention: Bandwidth-Efficient LLM Inference

Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley et al.

ICML 2024 · poster · arXiv:2312.04985

Sparse and Structured Hopfield Networks

Saúl Santos, Vlad Niculae, Daniel McNamee et al.

ICML 2024 · spotlight · arXiv:2402.13725

SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization

Xixu Hu, Runkai Zheng, Jindong Wang et al.

ECCV 2024 · poster · arXiv:2402.03317
5 citations

SpikingBERT: Distilling BERT to Train Spiking Language Models Using Implicit Differentiation

Malyaban Bal, Abhronil Sengupta

AAAI 2024 · paper · arXiv:2308.10873
70 citations

StableMask: Refining Causal Masking in Decoder-only Transformer

Qingyu Yin, Xuzheng He, Xiang Zhuang et al.

ICML 2024 · poster · arXiv:2402.04779

Statistical Test for Attention Maps in Vision Transformers

Tomohiro Shiraishi, Daiki Miwa, Teruyuki Katsuoka et al.

ICML 2024 · poster

Stripe Observation Guided Inference Cost-free Attention Mechanism

Zhongzhan Huang, Shanshan Zhong, Wushao Wen et al.

ECCV 2024 · poster
1 citation

Subgraphormer: Unifying Subgraph GNNs and Graph Transformers via Graph Products

Guy Bar Shalom, Beatrice Bevilacqua, Haggai Maron

ICML 2024 · poster · arXiv:2402.08450

Tandem Transformers for Inference Efficient LLMs

Aishwarya P S, Pranav Nair, Yashas Samaga et al.

ICML 2024 · poster · arXiv:2402.08644

TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling

Dong Huo, Zixin Guo, Xinxin Zuo et al.

ECCV 2024 · poster · arXiv:2408.01291
19 citations

Towards Diverse Perspective Learning with Selection over Multiple Temporal Poolings

Jihyeon Seong, Jungmin Kim, Jaesik Choi

AAAI 2024 · paper · arXiv:2403.09749
1 citation

Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks

Siyu Zou, Jiji Tang, Yiyi Zhou et al.

AAAI 2024 · paper · arXiv:2401.07709
19 citations

Towards Efficient Spiking Transformer: a Token Sparsification Framework for Training and Inference Acceleration

Zhengyang Zhuge, Peisong Wang, Xingting Yao et al.

ICML 2024 · poster

Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features

Simone Bombari, Marco Mondelli

ICML 2024 · poster · arXiv:2402.02969

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

Tri Dao, Albert Gu

ICML 2024 · poster · arXiv:2405.21060

Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape

Juno Kim, Taiji Suzuki

ICML 2024 · poster · arXiv:2402.01258

Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration

Zhongzhi Yu, Zheng Wang, Yonggan Fu et al.

ICML 2024 · poster · arXiv:2406.15765

Vectorized Conditional Neural Fields: A Framework for Solving Time-dependent Parametric Partial Differential Equations

Jan Hagnberger, Marimuthu Kalimuthu, Daniel Musekamp et al.

ICML 2024 · oral · arXiv:2406.03919

Viewing Transformers Through the Lens of Long Convolutions Layers

Itamar Zimerman, Lior Wolf

ICML 2024 · poster

Visual Transformer with Differentiable Channel Selection: An Information Bottleneck Inspired Approach

Yancheng Wang, Ping Li, Yingzhen Yang

ICML 2024 · poster

ViT-Calibrator: Decision Stream Calibration for Vision Transformer

Lin Chen, Zhijie Jia, Lechao Cheng et al.

AAAI 2024 · paper · arXiv:2304.04354
3 citations

Wavelength-Embedding-guided Filter-Array Transformer for Spectral Demosaicing

Haijin Zeng, Hiep Luong, Wilfried Philips

ECCV 2024 · poster
1 citation

What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks

Xingwu Chen, Difan Zou

ICML 2024 · poster · arXiv:2404.01601