Mikhail Burtsev
4 papers, 2 total citations

Papers (4)
Limitations of Normalization in Attention
NeurIPS 2025, arXiv. 2 citations.
Beyond Attention: Breaking the Limits of Transformer Context Length with Recurrent Memory
AAAI 2024. 0 citations.
Recurrent Memory Transformer
NeurIPS 2022, arXiv. 0 citations.
Explain My Surprise: Learning Efficient Long-Term Memory by predicting uncertain outcomes
NeurIPS 2022, arXiv. 0 citations.