Poster "attention mechanism" Papers

233 papers found • Page 2 of 5

From Attention to Activation: Unraveling the Enigmas of Large Language Models

Prannay Kaul, Chengcheng Ma, Ismail Elezi et al.

ICLR 2025 • poster • arXiv:2410.17174
7 citations

From Softmax to Score: Transformers Can Effectively Implement In-Context Denoising Steps

Paul Rosu, Lawrence Carin, Xiang Cheng

NeurIPS 2025 • poster

Fully-inductive Node Classification on Arbitrary Graphs

Jianan Zhao, Zhaocheng Zhu, Mikhail Galkin et al.

ICLR 2025 • poster • arXiv:2405.20445
12 citations

Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency

Jerry Yao-Chieh Hu, Wei-Po Wang, Ammar Gilani et al.

ICLR 2025 • poster • arXiv:2411.16525
18 citations

Generalization Guarantees for Representation Learning via Data-Dependent Gaussian Mixture Priors

Milad Sefidgaran, Abdellatif Zaidi, Piotr Krasnowski

ICLR 2025 • poster • arXiv:2502.15540
3 citations

Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks

Bhishma Dedhia, David Bourgin, Krishna Kumar Singh et al.

ICCV 2025 • poster • arXiv:2503.17539
1 citation

Glance2Gaze: Efficient Vision-Language Models from Glance Fusion to Gaze Compression

Juan Chen, Honglin Liu, Yingying Ao et al.

NeurIPS 2025 • poster

Global Regulation and Excitation via Attention Tuning for Stereo Matching

Jiahao Li, Xinhong Chen, Zhengmin Jiang et al.

ICCV 2025 • poster • arXiv:2509.15891
2 citations

Graph-Based Attention for Differentiable MaxSAT Solving

Sota Moriyama, Katsumi Inoue

NeurIPS 2025 • poster

Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access

Xiang Hu, Jiaqi Leng, Jun Zhao et al.

NeurIPS 2025 • poster • arXiv:2504.16795
2 citations

Hierarchical Self-Attention: Generalizing Neural Attention Mechanics to Multi-Scale Problems

Saeed Amizadeh, Sara Abdali, Yinheng Li et al.

NeurIPS 2025 • poster • arXiv:2509.15448

HIIF: Hierarchical Encoding based Implicit Image Function for Continuous Super-resolution

Yuxuan Jiang, Ho Man Kwan, Jasmine Peng et al.

CVPR 2025 • poster • arXiv:2412.03748
10 citations

HoliTracer: Holistic Vectorization of Geographic Objects from Large-Size Remote Sensing Imagery

Yu Wang, Bo Dang, Wanchun Li et al.

ICCV 2025 • poster • arXiv:2507.16251
1 citation

HSI: A Holistic Style Injector for Arbitrary Style Transfer

Shuhao Zhang, Hui Kang, Yang Liu et al.

CVPR 2025 • poster • arXiv:2502.04369
1 citation

Identifying and Mitigating Position Bias of Multi-image Vision-Language Models

Xinyu Tian, Shu Zou, Zhaoyuan Yang et al.

CVPR 2025 • poster • arXiv:2503.13792
10 citations

Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads

Zhoutong Wu, Yuan Zhang, Yiming Dong et al.

NeurIPS 2025 • poster • arXiv:2510.16807

Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs

Barrett Tang, Zile Huang, Chengzhi Liu et al.

ICLR 2025 • poster
20 citations

JAFAR: Jack up Any Feature at Any Resolution

Paul Couairon, Loïck Chambon, Louis Serrano et al.

NeurIPS 2025 • poster • arXiv:2506.11136
7 citations

JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model

Qihao Duan, Bingding Huang, Zhenqiao Song et al.

NeurIPS 2025 • poster • arXiv:2505.17257
1 citation

Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search

Yuxian Gu, Qinghao Hu, Haocheng Xi et al.

NeurIPS 2025 • poster • arXiv:2508.15884
15 citations

KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments

Junyoung Park, Dalton Jones, Matthew Morse et al.

NeurIPS 2025 • poster • arXiv:2504.15364
11 citations

Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator

Chaehun Shin, Jooyoung Choi, Heeseung Kim et al.

CVPR 2025 • poster • arXiv:2411.15466
36 citations

Learning to Focus: Causal Attention Distillation via Gradient-Guided Token Pruning

Yiju Guo, Wenkai Yang, Zexu Sun et al.

NeurIPS 2025 • poster • arXiv:2506.07851
3 citations

LevAttention: Time, Space and Streaming Efficient Algorithm for Heavy Attentions

Ravindran Kannan, Chiranjib Bhattacharyya, Praneeth Kacham et al.

ICLR 2025 • poster • arXiv:2410.05462
1 citation

Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs

Rui Dai, Sile Hu, Xu Shen et al.

ICLR 2025 • poster • arXiv:2504.10902
6 citations

Limitations of Normalization in Attention

Timur Mudarisov, Mikhail Burtsev, Tatiana Petrova et al.

NeurIPS 2025 • poster • arXiv:2508.17821
2 citations

Linear Differential Vision Transformer: Learning Visual Contrasts via Pairwise Differentials

Yifan Pu, Jixuan Ying, Qixiu Li et al.

NeurIPS 2025 • poster • arXiv:2511.00833

Long Context Tuning for Video Generation

Yuwei Guo, Ceyuan Yang, Ziyan Yang et al.

ICCV 2025 • poster • arXiv:2503.10589
57 citations

Long-Sequence Recommendation Models Need Decoupled Embeddings

Ningya Feng, Junwei Pan, Jialong Wu et al.

ICLR 2025 • poster • arXiv:2410.02604
11 citations

MambaIRv2: Attentive State Space Restoration

Hang Guo, Yong Guo, Yaohua Zha et al.

CVPR 2025 • poster • arXiv:2411.15269
82 citations

Mamba Modulation: On the Length Generalization of Mamba Models

Peng Lu, Jerry Huang, Qiuhao Zeng et al.

NeurIPS 2025 • poster

MambaOut: Do We Really Need Mamba for Vision?

Weihao Yu, Xinchao Wang

CVPR 2025 • poster • arXiv:2405.07992
186 citations

MeCeFO: Enhancing LLM Training Robustness via Fault-Tolerant Optimization

Rizhen Hu, Yutong He, Ran Yan et al.

NeurIPS 2025 • poster • arXiv:2510.16415

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

Ziyu Liu, Yuhang Zang, Xiaoyi Dong et al.

ICLR 2025 • poster • arXiv:2410.17637
19 citations

Mind the Quote: Enabling Quotation-Aware Dialogue in LLMs via Plug-and-Play Modules

Yueqi Zhang, Peiwen Yuan, Yiwei Li et al.

NeurIPS 2025 • poster • arXiv:2505.24292

MoEdit: On Learning Quantity Perception for Multi-object Image Editing

Yanfeng Li, Ka-Hou Chan, Yue Sun et al.

CVPR 2025 • poster • arXiv:2503.10112
4 citations

MoE-Gyro: Self-Supervised Over-Range Reconstruction and Denoising for MEMS Gyroscopes

Feiyang Pan, Shenghe Zheng, Chunyan Yin et al.

NeurIPS 2025 • poster • arXiv:2506.06318
2 citations

MoFo: Empowering Long-term Time Series Forecasting with Periodic Pattern Modeling

Jiaming Ma, Binwu Wang, Qihe Huang et al.

NeurIPS 2025 • poster

Multi-Kernel Correlation-Attention Vision Transformer for Enhanced Contextual Understanding and Multi-Scale Integration

Hongkang Zhang, Shao-Lun Huang, Ercan Kuruoglu et al.

NeurIPS 2025 • poster

Multi-party Collaborative Attention Control for Image Customization

Han Yang, Chuanguang Yang, Qiuli Wang et al.

CVPR 2025 • poster • arXiv:2505.01428
5 citations

Multipole Attention for Efficient Long Context Reasoning

Coleman Hooper, Sebastian Zhao, Luca Manolache et al.

NeurIPS 2025 • poster • arXiv:2506.13059
3 citations

Multi-turn Consistent Image Editing

Zijun Zhou, Yingying Deng, Xiangyu He et al.

ICCV 2025 • poster • arXiv:2505.04320
5 citations

MV-SSM: Multi-View State Space Modeling for 3D Human Pose Estimation

Aviral Chharia, Wenbo Gou, Haoye Dong

CVPR 2025 • poster • arXiv:2509.00649
3 citations

MVTrajecter: Multi-View Pedestrian Tracking with Trajectory Motion Cost and Trajectory Appearance Cost

Taiga Yamane, Ryo Masumura, Satoshi Suzuki et al.

ICCV 2025 • poster • arXiv:2509.01157

Neural Attention Search

Difan Deng, Marius Lindauer

NeurIPS 2025 • poster • arXiv:2502.13251

Neural networks on Symmetric Spaces of Noncompact Type

Xuan Son Nguyen, Yang, Aymeric Histace

ICLR 2025 • poster • arXiv:2601.01097
1 citation

O-MaMa: Learning Object Mask Matching between Egocentric and Exocentric Views

Lorenzo Mur-Labadia, Maria Santos-Villafranca, Jesus Bermudez-Cameo et al.

ICCV 2025 • poster • arXiv:2506.06026
3 citations

One-Minute Video Generation with Test-Time Training

Jiarui Xu, Shihao Han, Karan Dalal et al.

CVPR 2025 • poster • arXiv:2504.05298
66 citations

On the Optimization and Generalization of Multi-head Attention

Christos Thrampoulidis, Rouzbeh Ghaderi, Hossein Taheri et al.

ICLR 2025 • poster • arXiv:2310.12680
44 citations

Parallel Sequence Modeling via Generalized Spatial Propagation Network

Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.

CVPR 2025 • poster • arXiv:2501.12381
3 citations