CVPR "flashattention compatibility" Papers
2 papers found
PACT: Pruning and Clustering-Based Token Reduction for Faster Visual Language Models
Mohamed Dhouib, Davide Buscaldi, Sonia Vanier et al.
CVPR 2025, poster, arXiv:2504.08966
15 citations
TopV: Compatible Token Pruning with Inference Time Optimization for Fast and Low-Memory Multimodal Vision Language Model
Cheng Yang, Yang Sui, Jinqi Xiao et al.
CVPR 2025, poster, arXiv:2503.18278
20 citations