Spotlight papers matching "attention mechanism"
12 papers found
A Closer Look at Graph Transformers: Cross-Aggregation and Beyond
Jiaming Zhuo, Ziyi Ma, Yintong Lu et al.
NeurIPS 2025 (spotlight)
FFN Fusion: Rethinking Sequential Computation in Large Language Models
Akhiad Bercovich, Mohammed Dabbah, Omri Puny et al.
NeurIPS 2025 (spotlight) · arXiv:2503.18908 · 2 citations
MoBA: Mixture of Block Attention for Long-Context LLMs
Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.
NeurIPS 2025 (spotlight) · arXiv:2502.13189 · 94 citations
Quantum Doubly Stochastic Transformers
Jannis Born, Filip Skogh, Kahn Rhrissorrakrai et al.
NeurIPS 2025 (spotlight) · arXiv:2504.16275 · 2 citations
SparseMVC: Probing Cross-view Sparsity Variations for Multi-view Clustering
Ruimeng Liu, Xin Zou, Chang Tang et al.
NeurIPS 2025 (spotlight)
Towards Interpretable and Efficient Attention: Compressing All by Contracting a Few
Qishuai Wen, Zhiyuan Huang, Chun-Guang Li
NeurIPS 2025 (spotlight) · arXiv:2509.16875 · 1 citation
Transformer brain encoders explain human high-level visual responses
Hossein Adeli, Sun Minni, Nikolaus Kriegeskorte
NeurIPS 2025 (spotlight) · arXiv:2505.17329 · 4 citations
TransMLA: Migrating GQA Models to MLA with Full DeepSeek Compatibility and Speedup
Fanxu Meng, Pingzhi Tang, Zengwei Yao et al.
NeurIPS 2025 (spotlight)
InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation
Jacob Si, Wendy Yusi Cheng, Michael Cooper et al.
ICML 2024 (spotlight)
Relaxing the Accurate Imputation Assumption in Doubly Robust Learning for Debiased Collaborative Filtering
Haoxuan Li, Chunyuan Zheng, Shuyi Wang et al.
ICML 2024 (spotlight)
Simple linear attention language models balance the recall-throughput tradeoff
Simran Arora, Sabri Eyuboglu, Michael Zhang et al.
ICML 2024 (spotlight)
Sparse and Structured Hopfield Networks
Saúl Santos, Vlad Niculae, Daniel McNamee et al.
ICML 2024 (spotlight)