"attention mechanism" Papers

385 papers found • Page 3 of 8

Graph-Based Attention for Differentiable MaxSAT Solving

Sota Moriyama, Katsumi Inoue

NEURIPS 2025

Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access

Xiang Hu, Jiaqi Leng, Jun Zhao et al.

NEURIPS 2025 • arXiv:2504.16795 • 3 citations

Harmonizing Visual and Textual Embeddings for Zero-Shot Text-to-Image Customization

Yeji Song, Jimyeong Kim, Wonhark Park et al.

AAAI 2025 • arXiv:2403.14155 • 5 citations

Hierarchical Self-Attention: Generalizing Neural Attention Mechanics to Multi-Scale Problems

Saeed Amizadeh, Sara Abdali, Yinheng Li et al.

NEURIPS 2025 • arXiv:2509.15448

HIIF: Hierarchical Encoding based Implicit Image Function for Continuous Super-resolution

Yuxuan Jiang, Ho Man Kwan, Jasmine Peng et al.

CVPR 2025 • arXiv:2412.03748 • 13 citations

HoliTracer: Holistic Vectorization of Geographic Objects from Large-Size Remote Sensing Imagery

Yu Wang, Bo Dang, Wanchun Li et al.

ICCV 2025 • arXiv:2507.16251 • 1 citation

HSI: A Holistic Style Injector for Arbitrary Style Transfer

Shuhao Zhang, Hui Kang, Yang Liu et al.

CVPR 2025 • arXiv:2502.04369 • 1 citation

Identifying and Mitigating Position Bias of Multi-image Vision-Language Models

Xinyu Tian, Shu Zou, Zhaoyuan Yang et al.

CVPR 2025 • arXiv:2503.13792 • 11 citations

Improving Adversarial Transferability on Vision Transformers via Forward Propagation Refinement

Yuchen Ren, Zhengyu Zhao, Chenhao Lin et al.

CVPR 2025 • arXiv:2503.15404 • 5 citations

Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads

Zhoutong Wu, Yuan Zhang, Yiming Dong et al.

NEURIPS 2025 • arXiv:2510.16807

Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs

Barrett Tang, Zile Huang, Chengzhi Liu et al.

ICLR 2025 • 20 citations

Intra and Inter Parser-Prompted Transformers for Effective Image Restoration

Cong Wang, Jinshan Pan, Liyan Wang et al.

AAAI 2025 • arXiv:2503.14037 • 5 citations

JADE: Joint Alignment and Deep Embedding for Multi-Slice Spatial Transcriptomics

Yuanchuan Guo, Jun Liu, Huimin Cheng et al.

NEURIPS 2025 • 1 citation

JAFAR: Jack up Any Feature at Any Resolution

Paul Couairon, Loïck Chambon, Louis Serrano et al.

NEURIPS 2025 • arXiv:2506.11136 • 7 citations

JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model

Qihao Duan, Bingding Huang, Zhenqiao Song et al.

NEURIPS 2025 • arXiv:2505.17257 • 3 citations

Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search

Yuxian Gu, Qinghao Hu, Haocheng Xi et al.

NEURIPS 2025 • arXiv:2508.15884 • 16 citations

KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments

Junyoung Park, Dalton Jones, Matthew Morse et al.

NEURIPS 2025 • arXiv:2504.15364 • 17 citations

Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator

Chaehun Shin, Jooyoung Choi, Heeseung Kim et al.

CVPR 2025 • arXiv:2411.15466 • 37 citations

Learning to Focus: Causal Attention Distillation via Gradient-Guided Token Pruning

Yiju Guo, Wenkai Yang, Zexu Sun et al.

NEURIPS 2025 • arXiv:2506.07851 • 4 citations

LevAttention: Time, Space and Streaming Efficient Algorithm for Heavy Attentions

Ravindran Kannan, Chiranjib Bhattacharyya, Praneeth Kacham et al.

ICLR 2025 • arXiv:2410.05462 • 2 citations

Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs

Rui Dai, Sile Hu, Xu Shen et al.

ICLR 2025 • arXiv:2504.10902 • 9 citations

Light3R-SfM: Towards Feed-forward Structure-from-Motion

Sven Elflein, Qunjie Zhou, Laura Leal-Taixe

CVPR 2025 (highlight) • arXiv:2501.14914 • 26 citations

Lightweight Contrastive Distilled Hashing for Online Cross-modal Retrieval

Jiaxing Li, Lin Jiang, Zeqi Ma et al.

AAAI 2025 • arXiv:2502.19751 • 2 citations

Limitations of Normalization in Attention

Timur Mudarisov, Mikhail Burtsev, Tatiana Petrova et al.

NEURIPS 2025 • arXiv:2508.17821 • 2 citations

Linear Differential Vision Transformer: Learning Visual Contrasts via Pairwise Differentials

Yifan Pu, Jixuan Ying, Qixiu Li et al.

NEURIPS 2025 • arXiv:2511.00833 • 1 citation

Long Context Tuning for Video Generation

Yuwei Guo, Ceyuan Yang, Ziyan Yang et al.

ICCV 2025 • arXiv:2503.10589 • 60 citations

Long-Sequence Recommendation Models Need Decoupled Embeddings

Ningya Feng, Junwei Pan, Jialong Wu et al.

ICLR 2025 • arXiv:2410.02604 • 12 citations

Lost in Transmission: When and Why LLMs Fail to Reason Globally

Tobias Schnabel, Kiran Tomlinson, Adith Swaminathan et al.

NEURIPS 2025 (spotlight) • arXiv:2505.08140 • 4 citations

Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation

Gao Peng, Le Zhuo, Dongyang Liu et al.

ICLR 2025 (oral) • 9 citations

MAESTRO: Masked Encoding Set Transformer with Self-Distillation

Matthew Lee, Jaesik Kim, Matei Ionita et al.

ICLR 2025

MambaIRv2: Attentive State Space Restoration

Hang Guo, Yong Guo, Yaohua Zha et al.

CVPR 2025 • arXiv:2411.15269 • 84 citations

Mamba Modulation: On the Length Generalization of Mamba Models

Peng Lu, Jerry Huang, Qiuhao Zeng et al.

NEURIPS 2025

MambaOut: Do We Really Need Mamba for Vision?

Weihao Yu, Xinchao Wang

CVPR 2025 • arXiv:2405.07992 • 192 citations

Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

Tianhao Qi, Jianlong Yuan, Wanquan Feng et al.

CVPR 2025 • 8 citations

MeCeFO: Enhancing LLM Training Robustness via Fault-Tolerant Optimization

Rizhen Hu, Yutong He, Ran Yan et al.

NEURIPS 2025 • arXiv:2510.16415

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

Ziyu Liu, Yuhang Zang, Xiaoyi Dong et al.

ICLR 2025 • arXiv:2410.17637 • 22 citations

Mimic In-Context Learning for Multimodal Tasks

Yuchu Jiang, Jiale Fu, Chenduo Hao et al.

CVPR 2025 • arXiv:2504.08851 • 9 citations

Mind the Quote: Enabling Quotation-Aware Dialogue in LLMs via Plug-and-Play Modules

Yueqi Zhang, Peiwen Yuan, Yiwei Li et al.

NEURIPS 2025 • arXiv:2505.24292

Mixture of Attentions For Speculative Decoding

Matthieu Zimmer, Milan Gritta, Gerasimos Lampouras et al.

ICLR 2025 • arXiv:2410.03804 • 14 citations

MoBA: Mixture of Block Attention for Long-Context LLMs

Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.

NEURIPS 2025 (spotlight) • arXiv:2502.13189 • 109 citations

MoEdit: On Learning Quantity Perception for Multi-object Image Editing

Yanfeng Li, Ka-Hou Chan, Yue Sun et al.

CVPR 2025 • arXiv:2503.10112 • 5 citations

MoE-Gyro: Self-Supervised Over-Range Reconstruction and Denoising for MEMS Gyroscopes

Feiyang Pan, Shenghe Zheng, Chunyan Yin et al.

NEURIPS 2025 • arXiv:2506.06318 • 2 citations

MoFo: Empowering Long-term Time Series Forecasting with Periodic Pattern Modeling

Jiaming Ma, Binwu Wang, Qihe Huang et al.

NEURIPS 2025

MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models

Yifan Liu, Keyu Fan, Weihao Yu et al.

CVPR 2025 • arXiv:2505.15185 • 8 citations

Multi-Kernel Correlation-Attention Vision Transformer for Enhanced Contextual Understanding and Multi-Scale Integration

Hongkang Zhang, Shao-Lun Huang, Ercan KURUOGLU et al.

NEURIPS 2025

Multi-party Collaborative Attention Control for Image Customization

Han Yang, Chuanguang Yang, Qiuli Wang et al.

CVPR 2025 • arXiv:2505.01428 • 5 citations

Multipole Attention for Efficient Long Context Reasoning

Coleman Hooper, Sebastian Zhao, Luca Manolache et al.

NEURIPS 2025 • arXiv:2506.13059 • 3 citations

Multi-turn Consistent Image Editing

Zijun Zhou, Yingying Deng, Xiangyu He et al.

ICCV 2025 • arXiv:2505.04320 • 6 citations

MV-SSM: Multi-View State Space Modeling for 3D Human Pose Estimation

Aviral Chharia, Wenbo Gou, Haoye Dong

CVPR 2025 • arXiv:2509.00649 • 4 citations

MVTrajecter: Multi-View Pedestrian Tracking with Trajectory Motion Cost and Trajectory Appearance Cost

Taiga Yamane, Ryo Masumura, Satoshi Suzuki et al.

ICCV 2025 • arXiv:2509.01157