Spotlight "transformer architecture" Papers
6 papers found
ALINE: Joint Amortization for Bayesian Inference and Active Data Acquisition
Daolang Huang, Xinyi Wen, Ayush Bharti et al.
NeurIPS 2025, spotlight, arXiv:2506.07259
2 citations
FFN Fusion: Rethinking Sequential Computation in Large Language Models
Akhiad Bercovich, Mohammed Dabbah, Omri Puny et al.
NeurIPS 2025, spotlight, arXiv:2503.18908
2 citations
Quantum Doubly Stochastic Transformers
Jannis Born, Filip Skogh, Kahn Rhrissorrakrai et al.
NeurIPS 2025, spotlight, arXiv:2504.16275
2 citations
Transformer brain encoders explain human high-level visual responses
Hossein Adeli, Sun Minni, Nikolaus Kriegeskorte
NeurIPS 2025, spotlight, arXiv:2505.17329
4 citations
TransMLA: Migrating GQA Models to MLA with Full DeepSeek Compatibility and Speedup
Fanxu Meng, Pingzhi Tang, Zengwei Yao et al.
NeurIPS 2025, spotlight
Transolver: A Fast Transformer Solver for PDEs on General Geometries
Haixu Wu, Huakun Luo, Haowen Wang et al.
ICML 2024, spotlight