"attention mechanism" Papers

380 papers found • Page 1 of 8

AbsenceBench: Language Models Can’t See What’s Missing

Harvey Yiyun Fu, Aryan Shrivastava, Jared Moore et al.

NEURIPS 2025 • spotlight

Achilles' Heel of Mamba: Essential difficulties of the Mamba architecture demonstrated by synthetic data

Tianyi Chen, Pengxiao Lin, Zhiwei Wang et al.

NEURIPS 2025 • spotlight • arXiv:2509.17514

A Closer Look at Graph Transformers: Cross-Aggregation and Beyond

Jiaming Zhuo, Ziyi Ma, Yintong Lu et al.

NEURIPS 2025 • spotlight

Adaptive Transformer Programs: Bridging the Gap Between Performance and Interpretability in Transformers

Quoc-Vinh Lai-Dang, Taemin Kang, Seungah Son

ICLR 2025 • poster

Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models

Yoad Tewel, Rinon Gal, Dvir Samuel et al.

ICLR 2025 • poster • arXiv:2411.07232 • 34 citations

Advancing Multiple Instance Learning with Continual Learning for Whole Slide Imaging

Xianrui Li, Yufei Cui, Jun Li et al.

CVPR 2025 • highlight • arXiv:2505.10649

Advancing Spiking Neural Networks Towards Multiscale Spatiotemporal Interaction Learning

Yimeng Shan, Malu Zhang, Rui-jie Zhu et al.

AAAI 2025 • paper • arXiv:2405.13672 • 12 citations

Adversarial Attention Perturbations for Large Object Detection Transformers

Zachary Yahn, Selim Tekin, Fatih Ilhan et al.

ICCV 2025 • poster • arXiv:2508.02987 • 2 citations

A Hidden Stumbling Block in Generalized Category Discovery: Distracted Attention

Qiyu Xu, Zhanxuan Hu, Yu Duan et al.

ICCV 2025 • poster • arXiv:2507.14315 • 2 citations

A Lightweight UDF Learning Framework for 3D Reconstruction Based on Local Shape Functions

Jiangbei Hu, Yanggeng Li, Fei Hou et al.

CVPR 2025 • poster • arXiv:2407.01330 • 3 citations

Aligning by Misaligning: Boundary-aware Curriculum Learning for Multimodal Alignment

Hua Ye, Hang Ding, Siyuan Chen et al.

NEURIPS 2025 • poster • arXiv:2511.08399

AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models

Xinghui Li, Qichao Sun, Pengze Zhang et al.

CVPR 2025 • poster • arXiv:2412.04146 • 7 citations

A Training-Free Sub-quadratic Cost Transformer Model Serving Framework with Hierarchically Pruned Attention

Heejun Lee, Geon Park, Youngwan Lee et al.

ICLR 2025 • poster • arXiv:2406.09827 • 8 citations

Attention as a Hypernetwork

Simon Schug, Seijin Kobayashi, Yassir Akram et al.

ICLR 2025 • poster • arXiv:2406.05816 • 10 citations

Attention (as Discrete-Time Markov) Chains

Yotam Erel, Olaf Dünkel, Rishabh Dabral et al.

NEURIPS 2025 • poster • arXiv:2507.17657 • 1 citation

Attention-based clustering

Rodrigo Maulen Soto, Pierre Marion, Claire Boyer

NEURIPS 2025 • poster • arXiv:2505.13112 • 1 citation

Attention layers provably solve single-location regression

Pierre Marion, Raphaël Berthier, Gérard Biau et al.

ICLR 2025 • poster • arXiv:2410.01537 • 10 citations

Attention Mechanism, Max-Affine Partition, and Universal Approximation

Hude Liu, Jerry Yao-Chieh Hu, Zhao Song et al.

NEURIPS 2025 • poster • arXiv:2504.19901 • 6 citations

Attention with Markov: A Curious Case of Single-layer Transformers

Ashok Makkuva, Marco Bondaschi, Adway Girish et al.

ICLR 2025 • poster • arXiv:2402.04161 • 39 citations

AudioGenX: Explainability on Text-to-Audio Generative Models

Hyunju Kang, Geonhee Han, Yoonjae Jeong et al.

AAAI 2025 • paper • arXiv:2502.00459

AWRaCLe: All-Weather Image Restoration Using Visual In-Context Learning

Sudarshan Rajagopalan, Vishal M. Patel

AAAI 2025 • paper • arXiv:2409.00263 • 13 citations

Balancing Conservatism and Aggressiveness: Prototype-Affinity Hybrid Network for Few-Shot Segmentation

Tianyu Zou, Shengwu Xiong, Ruilin Yao et al.

ICCV 2025 • poster • arXiv:2507.19140 • 1 citation

Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program

Minghe Gao, Xuqi Liu, Zhongqi Yue et al.

ICCV 2025 • poster • arXiv:2504.06606 • 10 citations

Beyond [cls]: Exploring the True Potential of Masked Image Modeling Representations

Marcin Przewięźlikowski, Randall Balestriero, Wojciech Jasiński et al.

ICCV 2025 • poster • arXiv:2412.03215 • 4 citations

BioX-CPath: Biologically-driven Explainable Diagnostics for Multistain IHC Computational Pathology

Amaya Gallagher-Syed, Henry Senior, Omnia Alwazzan et al.

CVPR 2025 • poster • arXiv:2503.20880

Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers

Kazuki Irie, Morris Yau, Samuel J Gershman

NEURIPS 2025 • poster • arXiv:2506.00744 • 6 citations

Block-Attention for Efficient Prefilling

Dongyang Ma, Yan Wang, Tian Lan

ICLR 2025 • poster • arXiv:2409.15355 • 14 citations

BlockDecoder: Boosting ASR Decoders with Context and Merger Modules

Darshan Prabhu, Preethi Jyothi

NEURIPS 2025 • poster

Boltzmann Attention Sampling for Image Analysis with Small Objects

Theodore Zhao, Sid Kiblawi, Mu Wei et al.

CVPR 2025 • poster • arXiv:2503.02841 • 2 citations

Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors

Zhengfei Kuang, Tianyuan Zhang, Kai Zhang et al.

CVPR 2025 • poster • arXiv:2411.17249 • 4 citations

CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution

Xin Liu, Jie Liu, Jie Tang et al.

CVPR 2025 • poster • arXiv:2503.06896 • 24 citations

CATCH: Channel-Aware Multivariate Time Series Anomaly Detection via Frequency Patching

Xingjian Wu, Xiangfei Qiu, Zhengyu Li et al.

ICLR 2025 • poster • arXiv:2410.12261 • 59 citations

CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models

Zheng Chong, Xiao Dong, Haoxiang Li et al.

ICLR 2025 • poster • arXiv:2407.15886 • 67 citations

Class Distribution-induced Attention Map for Open-vocabulary Semantic Segmentations

Dong Un Kang, Hayeon Kim, Se Young Chun

ICLR 2025 • poster

CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up

Songhua Liu, Zhenxiong Tan, Xinchao Wang

NEURIPS 2025 • poster • arXiv:2412.16112 • 20 citations

Composing Linear Layers from Irreducibles

Travis Pence, Daisuke Yamada, Vikas Singh

NEURIPS 2025 • poster • arXiv:2507.11688

ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices

Hao Yu, Tangyu Jiang, Shuning Jia et al.

CVPR 2025 • poster • arXiv:2506.03737 • 3 citations

Constraint-Aware Feature Learning for Parametric Point Cloud

Xi Cheng, Ruiqi Lei, Di Huang et al.

ICCV 2025 • poster • arXiv:2411.07747 • 3 citations

ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval

Eric Xing, Pranavi Kolouju, Robert Pless et al.

CVPR 2025 • poster • arXiv:2505.20764 • 2 citations

Continual Multiple Instance Learning with Enhanced Localization for Histopathological Whole Slide Image Analysis

Byung Hyun Lee, Wongi Jeong, Woojae Han et al.

ICCV 2025 • poster • arXiv:2507.02395 • 2 citations

Continuity and Isolation Lead to Doubts or Dilemmas in Large Language Models

Hector Pasten, Felipe Urrutia, Hector Orellana et al.

NEURIPS 2025 • poster • arXiv:2505.10606 • 1 citation

Contrastive Test-Time Composition of Multiple LoRA Models for Image Generation

Tuna Meral, Enis Simsar, Federico Tombari et al.

ICCV 2025 • highlight • arXiv:2403.19776 • 5 citations

CoTFormer: A Chain of Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference

Amirkeivan Mohtashami, Matteo Pagliardini, Martin Jaggi

ICLR 2025 • poster • arXiv:2310.10845 • 15 citations

CrAM: Credibility-Aware Attention Modification in LLMs for Combating Misinformation in RAG

Boyi Deng, Wenjie Wang, Fengbin Zhu et al.

AAAI 2025 • paper • arXiv:2406.11497 • 19 citations

CrossSpectra: Exploiting Cross-Layer Smoothness for Parameter-Efficient Fine-Tuning

Yifei Zhang, Hao Zhu, Junhao Dong et al.

NEURIPS 2025 • poster

DAMM-Diffusion: Learning Divergence-Aware Multi-Modal Diffusion Model for Nanoparticles Distribution Prediction

Junjie Zhou, Shouju Wang, Yuxia Tang et al.

CVPR 2025 • highlight • arXiv:2503.09491 • 1 citation

Defining and Discovering Hyper-meta-paths for Heterogeneous Hypergraphs

Yaming Yang, Ziyu Zheng, Weigang Lu et al.

NEURIPS 2025 • poster

Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction

Jeffrey Willette, Heejun Lee, Sung Ju Hwang

NEURIPS 2025 • poster • arXiv:2505.11254 • 2 citations

Dependency Parsing is More Parameter-Efficient with Normalization

Paolo Gajo, Domenic Rosati, Hassan Sajjad et al.

NEURIPS 2025 • poster • arXiv:2505.20215

Devil is in the Detail: Towards Injecting Fine Details of Image Prompt in Image Generation via Conflict-free Guidance and Stratified Attention

Kyungmin Jo, Jooyeol Yun, Jaegul Choo

CVPR 2025 • poster • arXiv:2508.02004 • 2 citations