Poster "memory efficiency" Papers
9 papers found
Ask and Remember: A Questions-Only Replay Strategy for Continual Visual Question Answering
Imad Eddine MAROUF, Enzo Tartaglione, Stéphane Lathuilière et al.
ICCV 2025, poster, arXiv:2502.04469
1 citation
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
Guangxuan Xiao, Jiaming Tang, Jingwei Zuo et al.
ICLR 2025, poster, arXiv:2410.10819
165 citations
SVDQuant: Absorbing Outliers by Low-Rank Component for 4-Bit Diffusion Models
Muyang Li, Yujun Lin, Zhekai Zhang et al.
ICLR 2025, poster, arXiv:2411.05007
90 citations
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
Maximilian Beck, Korbinian Pöppel, Phillip Lippe et al.
NeurIPS 2025, poster, arXiv:2503.14376
8 citations
Variational Bayesian Pseudo-Coreset
Hyungi Lee, Seungyoo Lee, Juho Lee
ICLR 2025, poster, arXiv:2502.21143
CHAI: Clustered Head Attention for Efficient LLM Inference
Saurabh Agarwal, Bilge Acun, Basil Hosmer et al.
ICML 2024, poster
DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
Donghyun Kim, Byeongho Heo, Dongyoon Han
ECCV 2024, poster, arXiv:2403.19588
40 citations
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
Piotr Nawrot, Adrian Łańcucki, Marcin Chochowski et al.
ICML 2024, poster
Memory Efficient Neural Processes via Constant Memory Attention Block
Leo Feng, Frederick Tung, Hossein Hajimirsadeghi et al.
ICML 2024, poster