Poster "attention mechanism alternatives" Papers
2 papers found
Flash Inference: Near Linear Time Inference for Long Convolution Sequence Models and Beyond
Costin-Andrei Oncescu, Sanket Jayant Purandare, Stratos Idreos et al.
ICLR 2025, poster, arXiv:2410.12982
2 citations
Theory, Analysis, and Best Practices for Sigmoid Self-Attention
Jason Ramapuram, Federico Danieli, Eeshan Gunesh Dhekane et al.
ICLR 2025, poster, arXiv:2409.04431
34 citations