Poster "attention mechanism" Papers

292 papers found • Page 1 of 6

Adaptive Transformer Programs: Bridging the Gap Between Performance and Interpretability in Transformers

Quoc-Vinh Lai-Dang, Taemin Kang, Seungah Son

ICLR 2025

Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models

Yoad Tewel, Rinon Gal, Dvir Samuel et al.

ICLR 2025 • arXiv:2411.07232 • 35 citations

Adversarial Attention Perturbations for Large Object Detection Transformers

Zachary Yahn, Selim Tekin, Fatih Ilhan et al.

ICCV 2025 • arXiv:2508.02987 • 2 citations

A Hidden Stumbling Block in Generalized Category Discovery: Distracted Attention

Qiyu Xu, Zhanxuan Hu, Yu Duan et al.

ICCV 2025 • arXiv:2507.14315 • 3 citations

A Lightweight UDF Learning Framework for 3D Reconstruction Based on Local Shape Functions

Jiangbei Hu, Yanggeng Li, Fei Hou et al.

CVPR 2025 • arXiv:2407.01330 • 3 citations

Aligning by Misaligning: Boundary-aware Curriculum Learning for Multimodal Alignment

Hua Ye, Hang Ding, Siyuan Chen et al.

NEURIPS 2025 • arXiv:2511.08399

AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models

Xinghui Li, Qichao Sun, Pengze Zhang et al.

CVPR 2025 • arXiv:2412.04146 • 8 citations

A Training-Free Sub-quadratic Cost Transformer Model Serving Framework with Hierarchically Pruned Attention

Heejun Lee, Geon Park, Youngwan Lee et al.

ICLR 2025 • arXiv:2406.09827 • 9 citations

Attention as a Hypernetwork

Simon Schug, Seijin Kobayashi, Yassir Akram et al.

ICLR 2025 • arXiv:2406.05816 • 10 citations

Attention (as Discrete-Time Markov) Chains

Yotam Erel, Olaf Dünkel, Rishabh Dabral et al.

NEURIPS 2025 • arXiv:2507.17657 • 1 citation

Attention-based clustering

Rodrigo Maulen Soto, Pierre Marion, Claire Boyer

NEURIPS 2025 • arXiv:2505.13112 • 1 citation

Attention layers provably solve single-location regression

Pierre Marion, Raphaël Berthier, Gérard Biau et al.

ICLR 2025 • arXiv:2410.01537 • 11 citations

Attention Mechanism, Max-Affine Partition, and Universal Approximation

Hude Liu, Jerry Yao-Chieh Hu, Zhao Song et al.

NEURIPS 2025 • arXiv:2504.19901 • 6 citations

Attention with Markov: A Curious Case of Single-layer Transformers

Ashok Makkuva, Marco Bondaschi, Adway Girish et al.

ICLR 2025 • arXiv:2402.04161 • 39 citations

Balancing Conservatism and Aggressiveness: Prototype-Affinity Hybrid Network for Few-Shot Segmentation

Tianyu Zou, Shengwu Xiong, Ruilin Yao et al.

ICCV 2025 • arXiv:2507.19140 • 1 citation

Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program

Minghe Gao, Xuqi Liu, Zhongqi Yue et al.

ICCV 2025 • arXiv:2504.06606 • 10 citations

Beyond [cls]: Exploring the True Potential of Masked Image Modeling Representations

Marcin Przewięźlikowski, Randall Balestriero, Wojciech Jasiński et al.

ICCV 2025 • arXiv:2412.03215 • 4 citations

BioX-CPath: Biologically-driven Explainable Diagnostics for Multistain IHC Computational Pathology

Amaya Gallagher-Syed, Henry Senior, Omnia Alwazzan et al.

CVPR 2025 • arXiv:2503.20880

Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers

Kazuki Irie, Morris Yau, Samuel J Gershman

NEURIPS 2025 • arXiv:2506.00744 • 6 citations

Block-Attention for Efficient Prefilling

Dongyang Ma, Yan Wang, Tian Lan

ICLR 2025 • arXiv:2409.15355 • 17 citations

BlockDecoder: Boosting ASR Decoders with Context and Merger Modules

Darshan Prabhu, Preethi Jyothi

NEURIPS 2025

Boltzmann Attention Sampling for Image Analysis with Small Objects

Theodore Zhao, Sid Kiblawi, Mu Wei et al.

CVPR 2025 • arXiv:2503.02841 • 2 citations

Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors

Zhengfei Kuang, Tianyuan Zhang, Kai Zhang et al.

CVPR 2025 • arXiv:2411.17249 • 4 citations

CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution

Xin Liu, Jie Liu, Jie Tang et al.

CVPR 2025 • arXiv:2503.06896 • 26 citations

CATCH: Channel-Aware Multivariate Time Series Anomaly Detection via Frequency Patching

Xingjian Wu, Xiangfei Qiu, Zhengyu Li et al.

ICLR 2025 • arXiv:2410.12261 • 68 citations

CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models

Zheng Chong, Xiao Dong, Haoxiang Li et al.

ICLR 2025 • arXiv:2407.15886 • 68 citations

Class Distribution-induced Attention Map for Open-vocabulary Semantic Segmentations

Dong Un Kang, Hayeon Kim, Se Young Chun

ICLR 2025

CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up

Songhua Liu, Zhenxiong Tan, Xinchao Wang

NEURIPS 2025 • arXiv:2412.16112 • 20 citations

Composing Linear Layers from Irreducibles

Travis Pence, Daisuke Yamada, Vikas Singh

NEURIPS 2025 • arXiv:2507.11688

ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices

Hao Yu, Tangyu Jiang, Shuning Jia et al.

CVPR 2025 • arXiv:2506.03737 • 4 citations

Constrained Belief Updates Explain Geometric Structures in Transformer Representations

Mateusz Piotrowski, Paul Riechers, Daniel Filan et al.

ICML 2025 • arXiv:2502.01954 • 6 citations

Constraint-Aware Feature Learning for Parametric Point Cloud

Xi Cheng, Ruiqi Lei, Di Huang et al.

ICCV 2025 • arXiv:2411.07747 • 3 citations

ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval

Eric Xing, Pranavi Kolouju, Robert Pless et al.

CVPR 2025 • arXiv:2505.20764 • 3 citations

Continual Multiple Instance Learning with Enhanced Localization for Histopathological Whole Slide Image Analysis

Byung Hyun Lee, Wongi Jeong, Woojae Han et al.

ICCV 2025 • arXiv:2507.02395 • 2 citations

Continuity and Isolation Lead to Doubts or Dilemmas in Large Language Models

Hector Pasten, Felipe Urrutia, Hector Orellana et al.

NEURIPS 2025 • arXiv:2505.10606 • 1 citation

CoTFormer: A Chain of Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference

Amirkeivan Mohtashami, Matteo Pagliardini, Martin Jaggi

ICLR 2025 • arXiv:2310.10845 • 15 citations

CrossSpectra: Exploiting Cross-Layer Smoothness for Parameter-Efficient Fine-Tuning

Yifei Zhang, Hao Zhu, Junhao Dong et al.

NEURIPS 2025

Defining and Discovering Hyper-meta-paths for Heterogeneous Hypergraphs

Yaming Yang, Ziyu Zheng, Weigang Lu et al.

NEURIPS 2025

Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction

Jeffrey Willette, Heejun Lee, Sung Ju Hwang

NEURIPS 2025 • arXiv:2505.11254 • 3 citations

Dependency Parsing is More Parameter-Efficient with Normalization

Paolo Gajo, Domenic Rosati, Hassan Sajjad et al.

NEURIPS 2025 • arXiv:2505.20215

Devil is in the Detail: Towards Injecting Fine Details of Image Prompt in Image Generation via Conflict-free Guidance and Stratified Attention

Kyungmin Jo, Jooyeol Yun, Jaegul Choo

CVPR 2025 • arXiv:2508.02004 • 2 citations

Devil is in the Uniformity: Exploring Diverse Learners within Transformer for Image Restoration

Shihao Zhou, Dayu Li, Jinshan Pan et al.

ICCV 2025 • arXiv:2503.20174 • 1 citation

Differential Transformer

Tianzhu Ye, Li Dong, Yuqing Xia et al.

ICLR 2025 • arXiv:2410.05258

Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient

George Wang, Jesse Hoogland, Stan van Wingerden et al.

ICLR 2025 • arXiv:2410.02984 • 24 citations

DiffSim: Taming Diffusion Models for Evaluating Visual Similarity

Yiren Song, Xiaokang Liu, Mike Zheng Shou

ICCV 2025 • arXiv:2412.14580 • 9 citations

DIFFSSR: Stereo Image Super-resolution Using Differential Transformer

Dafeng Zhang

NEURIPS 2025

Diffusion-Based Imaginative Coordination for Bimanual Manipulation

Huilin Xu, Jian Ding, Jiakun Xu et al.

ICCV 2025 • arXiv:2507.11296 • 2 citations

Dinomaly: The Less Is More Philosophy in Multi-Class Unsupervised Anomaly Detection

Jia Guo, Shuai Lu, Weihang Zhang et al.

CVPR 2025 • arXiv:2405.14325 • 56 citations

Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation

Chanyoung Kim, Dayun Ju, Woojung Han et al.

CVPR 2025 • arXiv:2411.17150 • 10 citations

Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers

Lei Chen, Joan Bruna, Alberto Bietti

ICLR 2025 • arXiv:2406.03068 • 8 citations