"attention mechanism" Papers

143 papers found • Page 1 of 3

Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models

Yoad Tewel, Rinon Gal, Dvir Samuel et al.

ICLR 2025 • poster • arXiv:2411.07232
34 citations

Aligning by Misaligning: Boundary-aware Curriculum Learning for Multimodal Alignment

Hua Ye, Hang Ding, Siyuan Chen et al.

NeurIPS 2025 • poster • arXiv:2511.08399

Attention Mechanism, Max-Affine Partition, and Universal Approximation

Hude Liu, Jerry Yao-Chieh Hu, Zhao Song et al.

NeurIPS 2025 • poster • arXiv:2504.19901
6 citations

Beyond [cls]: Exploring the True Potential of Masked Image Modeling Representations

Marcin Przewięźlikowski, Randall Balestriero, Wojciech Jasiński et al.

ICCV 2025 • poster • arXiv:2412.03215
4 citations

Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers

Kazuki Irie, Morris Yau, Samuel J Gershman

NeurIPS 2025 • poster • arXiv:2506.00744
6 citations

Block-Attention for Efficient Prefilling

Dongyang Ma, Yan Wang, Tian Lan

ICLR 2025 • poster • arXiv:2409.15355
14 citations

CATCH: Channel-Aware Multivariate Time Series Anomaly Detection via Frequency Patching

Xingjian Wu, Xiangfei Qiu, Zhengyu Li et al.

ICLR 2025 • poster • arXiv:2410.12261
59 citations

ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices

Hao Yu, Tangyu Jiang, Shuning Jia et al.

CVPR 2025 • poster • arXiv:2506.03737
3 citations

Constraint-Aware Feature Learning for Parametric Point Cloud

Xi Cheng, Ruiqi Lei, Di Huang et al.

ICCV 2025 • poster • arXiv:2411.07747
3 citations

CrossSpectra: Exploiting Cross-Layer Smoothness for Parameter-Efficient Fine-Tuning

Yifei Zhang, Hao Zhu, Junhao Dong et al.

NeurIPS 2025 • poster

DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models

Saeed Ranjbar Alvar, Gursimran Singh, Mohammad Akbari et al.

CVPR 2025 • poster • arXiv:2503.02175
48 citations

EDCFlow: Exploring Temporally Dense Difference Maps for Event-based Optical Flow Estimation

Daikun Liu, Lei Cheng, Teng Wang et al.

CVPR 2025 • poster • arXiv:2506.03512
3 citations

Everything is Editable: Extend Knowledge Editing to Unstructured Data in Large Language Models

Jingcheng Deng, Zihao Wei, Liang Pang et al.

ICLR 2025 • poster • arXiv:2405.15349
6 citations

FFN Fusion: Rethinking Sequential Computation in Large Language Models

Akhiad Bercovich, Mohammed Dabbah, Omri Puny et al.

NeurIPS 2025 • spotlight • arXiv:2503.18908
2 citations

FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference

Xunhao Lai, Jianqiao Lu, Yao Luo et al.

ICLR 2025 • poster • arXiv:2502.20766
51 citations

Glance2Gaze: Efficient Vision-Language Models from Glance Fusion to Gaze Compression

Juan Chen, Honglin Liu, Yingying Ao et al.

NeurIPS 2025 • poster

Graph-Based Attention for Differentiable MaxSAT Solving

Sota Moriyama, Katsumi Inoue

NeurIPS 2025 • poster

HIIF: Hierarchical Encoding based Implicit Image Function for Continuous Super-resolution

Yuxuan Jiang, Ho Man Kwan, Jasmine Peng et al.

CVPR 2025 • poster • arXiv:2412.03748
10 citations

Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads

Zhoutong Wu, Yuan Zhang, Yiming Dong et al.

NeurIPS 2025 • poster • arXiv:2510.16807

Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs

Barrett Tang, Zile Huang, Chengzhi Liu et al.

ICLR 2025 • poster
20 citations

Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search

Yuxian Gu, Qinghao Hu, Haocheng Xi et al.

NeurIPS 2025 • poster • arXiv:2508.15884
15 citations

Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs

Rui Dai, Sile Hu, Xu Shen et al.

ICLR 2025 • poster • arXiv:2504.10902
6 citations

Light3R-SfM: Towards Feed-forward Structure-from-Motion

Sven Elflein, Qunjie Zhou, Laura Leal-Taixe

CVPR 2025 • highlight • arXiv:2501.14914
27 citations

Long-Sequence Recommendation Models Need Decoupled Embeddings

Ningya Feng, Junwei Pan, Jialong Wu et al.

ICLR 2025 • poster • arXiv:2410.02604
11 citations

MoBA: Mixture of Block Attention for Long-Context LLMs

Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.

NeurIPS 2025 • spotlight • arXiv:2502.13189
94 citations

MoFo: Empowering Long-term Time Series Forecasting with Periodic Pattern Modeling

Jiaming Ma, Binwu Wang, Qihe Huang et al.

NeurIPS 2025 • poster

Multipole Attention for Efficient Long Context Reasoning

Coleman Hooper, Sebastian Zhao, Luca Manolache et al.

NeurIPS 2025 • poster • arXiv:2506.13059
3 citations

Multi-turn Consistent Image Editing

Zijun Zhou, Yingying Deng, Xiangyu He et al.

ICCV 2025 • poster • arXiv:2505.04320
5 citations

MVTrajecter: Multi-View Pedestrian Tracking with Trajectory Motion Cost and Trajectory Appearance Cost

Taiga Yamane, Ryo Masumura, Satoshi Suzuki et al.

ICCV 2025 • poster • arXiv:2509.01157

Parallel Sequence Modeling via Generalized Spatial Propagation Network

Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.

CVPR 2025 • poster • arXiv:2501.12381
3 citations

SAS: Simulated Attention Score

Chuanyang Zheng, Jiankai Sun, Yihang Gao et al.

NeurIPS 2025 • poster • arXiv:2507.07694
2 citations

Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding

Feilong Tang, Chengzhi Liu, Zhongxing Xu et al.

CVPR 2025 • poster • arXiv:2505.16652
22 citations

SparseMVC: Probing Cross-view Sparsity Variations for Multi-view Clustering

Ruimeng Liu, Xin Zou, Chang Tang et al.

NeurIPS 2025 • spotlight

Spiking Vision Transformer with Saccadic Attention

Shuai Wang, Malu Zhang, Dehao Zhang et al.

ICLR 2025 • oral • arXiv:2502.12677
15 citations

SpiLiFormer: Enhancing Spiking Transformers with Lateral Inhibition

Zeqi Zheng, Yanchen Huang, Yingchao Yu et al.

ICCV 2025 • poster • arXiv:2503.15986
1 citation

Systematic Outliers in Large Language Models

Yongqi An, Xu Zhao, Tao Yu et al.

ICLR 2025 • poster • arXiv:2502.06415
15 citations

Text to Sketch Generation with Multi-Styles

Tengjie Li, Shikui Tu, Lei Xu

NeurIPS 2025 • poster • arXiv:2511.04123

TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation

Mohan Xu, Kai Li, Guo Chen et al.

ICLR 2025 • oral • arXiv:2410.01469
11 citations

Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels

Maximilian Beck, Korbinian Pöppel, Phillip Lippe et al.

NeurIPS 2025 • poster • arXiv:2503.14376
8 citations

TopoPoint: Enhance Topology Reasoning via Endpoint Detection in Autonomous Driving

Yanping Fu, Xinyuan Liu, Tianyu Li et al.

NeurIPS 2025 • poster • arXiv:2505.17771
4 citations

Towards Interpretable and Efficient Attention: Compressing All by Contracting a Few

Qishuai Wen, Zhiyuan Huang, Chun-Guang Li

NeurIPS 2025 • spotlight • arXiv:2509.16875
1 citation

Transformer brain encoders explain human high-level visual responses

Hossein Adeli, Sun Minni, Nikolaus Kriegeskorte

NeurIPS 2025 • spotlight • arXiv:2505.17329
4 citations

TransMLA: Migrating GQA Models to MLA with Full DeepSeek Compatibility and Speedup

Fanxu Meng, Pingzhi Tang, Zengwei Yao et al.

NeurIPS 2025 • spotlight

Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms

Baran Hashemi, Kurt Pasque, Chris Teska et al.

NeurIPS 2025 • poster • arXiv:2505.17190
4 citations

URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration

Rui Xu, Yuzhen Niu, Yuezhou Li et al.

CVPR 2025 • poster • arXiv:2505.23068
4 citations

VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing

Xiangpeng Yang, Linchao Zhu, Hehe Fan et al.

ICLR 2025 • poster • arXiv:2502.17258
31 citations

Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models

Jinhui Yi, Syed Talal Wasim, Yanan Luo et al.

CVPR 2025 • poster • arXiv:2412.18609
1 citation

ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding

Jialiang Kang, Han Shu, Wenshuo Li et al.

NeurIPS 2025 • poster • arXiv:2509.15235
2 citations

A decoder-only foundation model for time-series forecasting

Abhimanyu Das, Weihao Kong, Rajat Sen et al.

ICML 2024 • oral

A Fixed-Point Approach for Causal Generative Modeling

Meyer Scetbon, Joel Jennings, Agrin Hilmkil et al.

ICML 2024 • poster