"attention mechanisms" Papers
7 papers found
On the Role of Attention Heads in Large Language Model Safety
Zhenhong Zhou, Haiyang Yu, Xinghua Zhang et al.
ICLR 2025 poster · arXiv:2410.13708
40 citations
Why Does the Effective Context Length of LLMs Fall Short?
Chenxin An, Jun Zhang, Ming Zhong et al.
ICLR 2025 poster · arXiv:2410.18745
BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Modern Hopfield Model
Chenwei Xu, Yu-Chao Huang, Jerry Yao-Chieh Hu et al.
ICML 2024 poster · arXiv:2404.03830
Improving Interpretation Faithfulness for Vision Transformers
Lijie Hu, Yixin Liu, Ninghao Liu et al.
ICML 2024 spotlight
Multi-Modal Latent Space Learning for Chain-of-Thought Reasoning in Language Models
Liqi He, Zuchao Li, Xiantao Cai et al.
AAAI 2024 paper · arXiv:2312.08762
34 citations
Pseudo-Label Calibration Semi-supervised Multi-Modal Entity Alignment
Luyao Wang, Pengnian Qi, Xigang Bao et al.
AAAI 2024 paper · arXiv:2403.01203
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
Zhen Qin, Weigao Sun, Dong Li et al.
ICML 2024 poster