"associative recall" Papers
7 papers found
DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products
Julien Siems, Timur Carstensen, Arber Zela et al.
NeurIPS 2025 | poster | arXiv:2502.10297 | 23 citations
Learning Randomized Algorithms with Transformers
Johannes von Oswald, Seijin Kobayashi, Yassir Akram et al.
ICLR 2025 | poster | arXiv:2408.10818 | 1 citation
Longhorn: State Space Models are Amortized Online Learners
Bo Liu, Rui Wang, Lemeng Wu et al.
ICLR 2025 | poster | arXiv:2407.14207 | 29 citations
Overcoming Long Context Limitations of State Space Models via Context Dependent Sparse Attention
Zhihao Zhan, Jianan Zhao, Zhaocheng Zhu et al.
NeurIPS 2025 | poster
RNNs are not Transformers (Yet): The Key Bottleneck on In-Context Retrieval
Kaiyue Wen, Xingyu Dang, Kaifeng Lyu
ICLR 2025 | poster | arXiv:2402.18510 | 48 citations
The emergence of sparse attention: impact of data distribution and benefits of repetition
Nicolas Zucchet, Francesco D'Angelo, Andrew Lampinen et al.
NeurIPS 2025 | oral | arXiv:2505.17863 | 6 citations
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
Peihao Wang, Ruisi Cai, Yuehao Wang et al.
ICLR 2025 | poster | arXiv:2501.00658 | 7 citations