"attention mechanism" Papers
143 papers found • Page 1 of 3
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
Yoad Tewel, Rinon Gal, Dvir Samuel et al.
Aligning by Misaligning: Boundary-aware Curriculum Learning for Multimodal Alignment
Hua Ye, Hang Ding, Siyuan Chen et al.
Attention Mechanism, Max-Affine Partition, and Universal Approximation
Hude Liu, Jerry Yao-Chieh Hu, Zhao Song et al.
Beyond [cls]: Exploring the True Potential of Masked Image Modeling Representations
Marcin Przewięźlikowski, Randall Balestriero, Wojciech Jasiński et al.
Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers
Kazuki Irie, Morris Yau, Samuel J Gershman
Block-Attention for Efficient Prefilling
Dongyang Ma, Yan Wang, Tian Lan
CATCH: Channel-Aware Multivariate Time Series Anomaly Detection via Frequency Patching
Xingjian Wu, Xiangfei Qiu, Zhengyu Li et al.
ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices
Hao Yu, Tangyu Jiang, Shuning Jia et al.
Constraint-Aware Feature Learning for Parametric Point Cloud
Xi Cheng, Ruiqi Lei, Di Huang et al.
CrossSpectra: Exploiting Cross-Layer Smoothness for Parameter-Efficient Fine-Tuning
Yifei Zhang, Hao Zhu, Junhao Dong et al.
DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models
Saeed Ranjbar Alvar, Gursimran Singh, Mohammad Akbari et al.
EDCFlow: Exploring Temporally Dense Difference Maps for Event-based Optical Flow Estimation
Daikun Liu, Lei Cheng, Teng Wang et al.
Everything is Editable: Extend Knowledge Editing to Unstructured Data in Large Language Models
Jingcheng Deng, Zihao Wei, Liang Pang et al.
FFN Fusion: Rethinking Sequential Computation in Large Language Models
Akhiad Bercovich, Mohammed Dabbah, Omri Puny et al.
FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
Xunhao Lai, Jianqiao Lu, Yao Luo et al.
Glance2Gaze: Efficient Vision-Language Models from Glance Fusion to Gaze Compression
Juan Chen, Honglin Liu, Yingying Ao et al.
Graph-Based Attention for Differentiable MaxSAT Solving
Sota Moriyama, Katsumi Inoue
HIIF: Hierarchical Encoding based Implicit Image Function for Continuous Super-resolution
Yuxuan Jiang, Ho Man Kwan, Jasmine Peng et al.
Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads
Zhoutong Wu, Yuan Zhang, Yiming Dong et al.
Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs
Barrett Tang, Zile Huang, Chengzhi Liu et al.
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
Yuxian Gu, Qinghao Hu, Haocheng Xi et al.
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
Rui Dai, Sile Hu, Xu Shen et al.
Light3R-SfM: Towards Feed-forward Structure-from-Motion
Sven Elflein, Qunjie Zhou, Laura Leal-Taixe
Long-Sequence Recommendation Models Need Decoupled Embeddings
Ningya Feng, Junwei Pan, Jialong Wu et al.
MoBA: Mixture of Block Attention for Long-Context LLMs
Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.
MoFo: Empowering Long-term Time Series Forecasting with Periodic Pattern Modeling
Jiaming Ma, Binwu Wang, Qihe Huang et al.
Multipole Attention for Efficient Long Context Reasoning
Coleman Hooper, Sebastian Zhao, Luca Manolache et al.
Multi-turn Consistent Image Editing
Zijun Zhou, Yingying Deng, Xiangyu He et al.
MVTrajecter: Multi-View Pedestrian Tracking with Trajectory Motion Cost and Trajectory Appearance Cost
Taiga Yamane, Ryo Masumura, Satoshi Suzuki et al.
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.
SAS: Simulated Attention Score
Chuanyang Zheng, Jiankai Sun, Yihang Gao et al.
Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding
Feilong Tang, Chengzhi Liu, Zhongxing Xu et al.
SparseMVC: Probing Cross-view Sparsity Variations for Multi-view Clustering
Ruimeng Liu, Xin Zou, Chang Tang et al.
Spiking Vision Transformer with Saccadic Attention
Shuai Wang, Malu Zhang, Dehao Zhang et al.
SpiLiFormer: Enhancing Spiking Transformers with Lateral Inhibition
Zeqi Zheng, Yanchen Huang, Yingchao Yu et al.
Systematic Outliers in Large Language Models
Yongqi An, Xu Zhao, Tao Yu et al.
Text to Sketch Generation with Multi-Styles
Tengjie Li, Shikui Tu, Lei Xu
TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation
Mohan Xu, Kai Li, Guo Chen et al.
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
Maximilian Beck, Korbinian Pöppel, Phillip Lippe et al.
TopoPoint: Enhance Topology Reasoning via Endpoint Detection in Autonomous Driving
Yanping Fu, Xinyuan Liu, Tianyu Li et al.
Towards Interpretable and Efficient Attention: Compressing All by Contracting a Few
Qishuai Wen, Zhiyuan Huang, Chun-Guang Li
Transformer brain encoders explain human high-level visual responses
Hossein Adeli, Sun Minni, Nikolaus Kriegeskorte
TransMLA: Migrating GQA Models to MLA with Full DeepSeek Compatibility and Speedup
Fanxu Meng, Pingzhi Tang, Zengwei Yao et al.
Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms
Baran Hashemi, Kurt Pasque, Chris Teska et al.
URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration
Rui Xu, Yuzhen Niu, Yuezhou Li et al.
VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing
Xiangpeng Yang, Linchao Zhu, Hehe Fan et al.
Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models
Jinhui Yi, Syed Talal Wasim, Yanan Luo et al.
ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding
Jialiang Kang, Han Shu, Wenshuo Li et al.
A decoder-only foundation model for time-series forecasting
Abhimanyu Das, Weihao Kong, Rajat Sen et al.
A Fixed-Point Approach for Causal Generative Modeling
Meyer Scetbon, Joel Jennings, Agrin Hilmkil et al.