2025 "multi-head attention" Papers
3 papers found
Devil is in the Uniformity: Exploring Diverse Learners within Transformer for Image Restoration
Shihao Zhou, Dayu Li, Jinshan Pan et al.
ICCV 2025, poster, arXiv:2503.20174
1 citation
On the Optimization and Generalization of Multi-head Attention
Christos Thrampoulidis, Rouzbeh Ghaderi, Hossein Taheri et al.
ICLR 2025, poster, arXiv:2310.12680
44 citations
SAS: Simulated Attention Score
Chuanyang Zheng, Jiankai Sun, Yihang Gao et al.
NeurIPS 2025, poster, arXiv:2507.07694
2 citations