NeurIPS "memory efficiency" Papers
2 papers found
Accurate KV Cache Eviction via Anchor Direction Projection for Efficient LLM Inference
Zijie Geng, Jie Wang, Ziqi Liu et al.
NeurIPS 2025 poster
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
Maximilian Beck, Korbinian Pöppel, Phillip Lippe et al.
NeurIPS 2025 poster, arXiv:2503.14376
8 citations