Poster Papers Matching "attention mechanism"

292 papers found • Page 3 of 6

Long-Sequence Recommendation Models Need Decoupled Embeddings

Ningya Feng, Junwei Pan, Jialong Wu et al.

ICLR 2025 • arXiv:2410.02604 • 12 citations

MAESTRO: Masked Encoding Set Transformer with Self-Distillation

Matthew Lee, Jaesik Kim, Matei Ionita et al.

ICLR 2025

MambaIRv2: Attentive State Space Restoration

Hang Guo, Yong Guo, Yaohua Zha et al.

CVPR 2025 • arXiv:2411.15269 • 84 citations

Mamba Modulation: On the Length Generalization of Mamba Models

Peng Lu, Jerry Huang, Qiuhao Zeng et al.

NEURIPS 2025

MambaOut: Do We Really Need Mamba for Vision?

Weihao Yu, Xinchao Wang

CVPR 2025 • arXiv:2405.07992 • 193 citations

Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

Tianhao Qi, Jianlong Yuan, Wanquan Feng et al.

CVPR 2025 • 8 citations

MeCeFO: Enhancing LLM Training Robustness via Fault-Tolerant Optimization

Rizhen Hu, Yutong He, Ran Yan et al.

NEURIPS 2025 • arXiv:2510.16415

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

Ziyu Liu, Yuhang Zang, Xiaoyi Dong et al.

ICLR 2025 • arXiv:2410.17637 • 22 citations

Mimic In-Context Learning for Multimodal Tasks

Yuchu Jiang, Jiale Fu, Chenduo Hao et al.

CVPR 2025 • arXiv:2504.08851 • 9 citations

Mind the Quote: Enabling Quotation-Aware Dialogue in LLMs via Plug-and-Play Modules

Yueqi Zhang, Peiwen Yuan, Yiwei Li et al.

NEURIPS 2025 • arXiv:2505.24292

Mixture of Attentions For Speculative Decoding

Matthieu Zimmer, Milan Gritta, Gerasimos Lampouras et al.

ICLR 2025 • arXiv:2410.03804 • 14 citations

MoEdit: On Learning Quantity Perception for Multi-object Image Editing

Yanfeng Li, Ka-Hou Chan, Yue Sun et al.

CVPR 2025 • arXiv:2503.10112 • 5 citations

MoE-Gyro: Self-Supervised Over-Range Reconstruction and Denoising for MEMS Gyroscopes

Feiyang Pan, Shenghe Zheng, Chunyan Yin et al.

NEURIPS 2025 • arXiv:2506.06318 • 2 citations

MoFo: Empowering Long-term Time Series Forecasting with Periodic Pattern Modeling

Jiaming Ma, Binwu Wang, Qihe Huang et al.

NEURIPS 2025

MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models

Yifan Liu, Keyu Fan, Weihao Yu et al.

CVPR 2025 • arXiv:2505.15185 • 8 citations

Multi-Kernel Correlation-Attention Vision Transformer for Enhanced Contextual Understanding and Multi-Scale Integration

Hongkang Zhang, Shao-Lun Huang, Ercan Kuruoglu et al.

NEURIPS 2025

Multi-party Collaborative Attention Control for Image Customization

Han Yang, Chuanguang Yang, Qiuli Wang et al.

CVPR 2025 • arXiv:2505.01428 • 5 citations

Multipole Attention for Efficient Long Context Reasoning

Coleman Hooper, Sebastian Zhao, Luca Manolache et al.

NEURIPS 2025 • arXiv:2506.13059 • 3 citations

Multi-turn Consistent Image Editing

Zijun Zhou, Yingying Deng, Xiangyu He et al.

ICCV 2025 • arXiv:2505.04320 • 6 citations

MV-SSM: Multi-View State Space Modeling for 3D Human Pose Estimation

Aviral Chharia, Wenbo Gou, Haoye Dong

CVPR 2025 • arXiv:2509.00649 • 4 citations

MVTrajecter: Multi-View Pedestrian Tracking with Trajectory Motion Cost and Trajectory Appearance Cost

Taiga Yamane, Ryo Masumura, Satoshi Suzuki et al.

ICCV 2025 • arXiv:2509.01157

Neural Attention Search

Difan Deng, Marius Lindauer

NEURIPS 2025 • arXiv:2502.13251 • 1 citation

Neural networks on Symmetric Spaces of Noncompact Type

Xuan Son Nguyen, Yang, Aymeric Histace

ICLR 2025 • arXiv:2601.01097 • 1 citation

O-MaMa: Learning Object Mask Matching between Egocentric and Exocentric Views

Lorenzo Mur-Labadia, Maria Santos-Villafranca, Jesus Bermudez-cameo et al.

ICCV 2025 • arXiv:2506.06026 • 3 citations

One-Minute Video Generation with Test-Time Training

Jiarui Xu, Shihao Han, Karan Dalal et al.

CVPR 2025 • arXiv:2504.05298 • 67 citations

On the Optimization and Generalization of Multi-head Attention

Christos Thrampoulidis, Rouzbeh Ghaderi, Hossein Taheri et al.

ICLR 2025 • arXiv:2310.12680 • 44 citations

Parallel Sequence Modeling via Generalized Spatial Propagation Network

Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.

CVPR 2025 • arXiv:2501.12381 • 3 citations

PatchGuard: Adversarially Robust Anomaly Detection and Localization through Vision Transformers and Pseudo Anomalies

Mojtaba Nafez, Amirhossein Koochakian, Arad Maleki et al.

CVPR 2025 • arXiv:2506.09237 • 2 citations

Pinpointing Attention-Causal Communication in Language Models

Gabriel Franco, Mark Crovella

NEURIPS 2025 • 1 citation

PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity

Kwanyoung Kim, Byeongsu Sim

ICCV 2025 • arXiv:2503.07677 • 1 citation

PolaFormer: Polarity-aware Linear Attention for Vision Transformers

Weikang Meng, Yadan Luo, Xin Li et al.

ICLR 2025 • arXiv:2501.15061 • 42 citations

Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity

Susav Shrestha, Bradley Settlemyer, Nikoli Dryden et al.

NEURIPS 2025 • arXiv:2505.14884 • 3 citations

Principles of Visual Tokens for Efficient Video Understanding

Xinyue Hao, Li, Shreyank Gowda et al.

ICCV 2025 • arXiv:2411.13626 • 1 citation

RANK++LETR: Learn to Rank and Optimize Candidates for Line Segment Detection

Xin Tong, Baojie Tian, Yufei Guo et al.

NEURIPS 2025

RAT: Bridging RNN Efficiency and Attention Accuracy via Chunk-based Sequence Modeling

Xiuying Wei, Anunay Yadav, Razvan Pascanu et al.

NEURIPS 2025 • arXiv:2507.04416

ResCLIP: Residual Attention for Training-free Dense Vision-language Inference

Jinhong Deng, Yuhang Yang, Wen Li et al.

CVPR 2025 • arXiv:2411.15851 • 11 citations

Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Zhengyao Lyu, Tianlin Pan, Chenyang Si et al.

ICCV 2025 • arXiv:2506.07986 • 6 citations

Rethinking the role of frames for SE(3)-invariant crystal structure modeling

Yusei Ito, Tatsunori Taniai, Ryo Igarashi et al.

ICLR 2025 • arXiv:2503.02209 • 8 citations

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

Di Liu, Meng Chen, Baotong Lu et al.

NEURIPS 2025 • arXiv:2409.10516 • 90 citations

Retrieval Head Mechanistically Explains Long-Context Factuality

Wenhao Wu, Yizhong Wang, Guangxuan Xiao et al.

ICLR 2025 • arXiv:2404.15574 • 150 citations

Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology

Wenhao Tang, Rong Qin, Heng Fang et al.

NEURIPS 2025 • arXiv:2506.02408 • 5 citations

Robust Message Embedding via Attention Flow-Based Steganography

Huayuan Ye, Shenzhuo Zhang, Shiqi Jiang et al.

CVPR 2025 • arXiv:2405.16414 • 5 citations

SAS: Simulated Attention Score

Chuanyang Zheng, Jiankai Sun, Yihang Gao et al.

NEURIPS 2025 • arXiv:2507.07694 • 2 citations

Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding

Feilong Tang, Chengzhi Liu, Zhongxing Xu et al.

CVPR 2025 • arXiv:2505.16652 • 25 citations

SeerAttention: Self-distilled Attention Gating for Efficient Long-context Prefilling

Yizhao Gao, Zhichen Zeng, DaYou Du et al.

NEURIPS 2025

See What You Are Told: Visual Attention Sink in Large Multimodal Models

Seil Kang, Jinyeong Kim, Junhyeok Kim et al.

ICLR 2025 • arXiv:2503.03321 • 61 citations

Selective Attention Improves Transformer

Yaniv Leviathan, Matan Kalman, Yossi Matias

ICLR 2025 • arXiv:2410.02703 • 21 citations

Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens

Qihang Fan, Huaibo Huang, Mingrui Chen et al.

ICCV 2025 • arXiv:2405.13337 • 3 citations

Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis

Woojung Han, Yeonkyung Lee, Chanyoung Kim et al.

CVPR 2025 • arXiv:2503.22168 • 5 citations

SpiLiFormer: Enhancing Spiking Transformers with Lateral Inhibition

Zeqi Zheng, Yanchen Huang, Yingchao Yu et al.

ICCV 2025 • arXiv:2503.15986 • 1 citation