"attention mechanism optimization" Papers
3 papers found
UniGist: Towards General and Hardware-aligned Sequence-level Long Context Compression
Chenlong Deng, Zhisong Zhang, Kelong Mao et al.
NeurIPS 2025 poster arXiv:2509.15763
1 citation
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
Harry Dong, Xinyu Yang, Zhenyu Zhang et al.
ICML 2024 poster
MobileNetV4: Universal Models for the Mobile Ecosystem
Danfeng Qin, Chas Leichner, Manolis Delakis et al.
ECCV 2024 poster arXiv:2404.10518
407 citations