Spotlight Papers Matching "attention mechanism"

12 papers found

A Closer Look at Graph Transformers: Cross-Aggregation and Beyond

Jiaming Zhuo, Ziyi Ma, Yintong Lu et al.

NeurIPS 2025 (Spotlight)

FFN Fusion: Rethinking Sequential Computation in Large Language Models

Akhiad Bercovich, Mohammed Dabbah, Omri Puny et al.

NeurIPS 2025 (Spotlight) · arXiv:2503.18908 · 2 citations

MoBA: Mixture of Block Attention for Long-Context LLMs

Enzhe Lu, Zhejun Jiang, Jingyuan Liu et al.

NeurIPS 2025 (Spotlight) · arXiv:2502.13189 · 94 citations

Quantum Doubly Stochastic Transformers

Jannis Born, Filip Skogh, Kahn Rhrissorrakrai et al.

NeurIPS 2025 (Spotlight) · arXiv:2504.16275 · 2 citations

SparseMVC: Probing Cross-view Sparsity Variations for Multi-view Clustering

Ruimeng Liu, Xin Zou, Chang Tang et al.

NeurIPS 2025 (Spotlight)

Towards Interpretable and Efficient Attention: Compressing All by Contracting a Few

Qishuai Wen, Zhiyuan Huang, Chun-Guang Li

NeurIPS 2025 (Spotlight) · arXiv:2509.16875 · 1 citation

Transformer brain encoders explain human high-level visual responses

Hossein Adeli, Sun Minni, Nikolaus Kriegeskorte

NeurIPS 2025 (Spotlight) · arXiv:2505.17329 · 4 citations

TransMLA: Migrating GQA Models to MLA with Full DeepSeek Compatibility and Speedup

Fanxu Meng, Pingzhi Tang, Zengwei Yao et al.

NeurIPS 2025 (Spotlight)

InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation

Jacob Si, Wendy Yusi Cheng, Michael Cooper et al.

ICML 2024 (Spotlight)

Relaxing the Accurate Imputation Assumption in Doubly Robust Learning for Debiased Collaborative Filtering

Haoxuan Li, Chunyuan Zheng, Shuyi Wang et al.

ICML 2024 (Spotlight)

Simple linear attention language models balance the recall-throughput tradeoff

Simran Arora, Sabri Eyuboglu, Michael Zhang et al.

ICML 2024 (Spotlight)

Sparse and Structured Hopfield Networks

Saúl Santos, Vlad Niculae, Daniel McNamee et al.

ICML 2024 (Spotlight)