Papers by Amir Abdi
2 papers found
MMInference: Accelerating Pre-filling for Long-Context Visual Language Models via Modality-Aware Permutation Sparse Attention
Yucheng Li, Huiqiang Jiang, Chengruidong Zhang et al.
ICML 2025 (oral)
SCBench: A KV Cache-Centric Analysis of Long-Context Methods
Yucheng Li, Huiqiang Jiang, Qianhui Wu et al.
ICLR 2025 (poster)
32 citations