Poster "inference time optimization" Papers
4 papers found
Context-aware Dynamic Pruning for Speech Foundation Models
Masao Someki, Yifan Peng, Siddhant Arora et al.
ICLR 2025 poster
7 citations
TopV: Compatible Token Pruning with Inference Time Optimization for Fast and Low-Memory Multimodal Vision Language Model
Cheng Yang, Yang Sui, Jinqi Xiao et al.
CVPR 2025 poster, arXiv:2503.18278
20 citations
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
Piotr Nawrot, Adrian Łańcucki, Marcin Chochowski et al.
ICML 2024 poster
Image-adaptive 3D Lookup Tables for Real-time Image Enhancement with Bilateral Grids
Wontae Kim, Nam Ik Cho
ECCV 2024 poster
7 citations