"resource-constrained inference" Papers
3 papers found
Efficient Source-Free Time-Series Adaptation via Parameter Subspace Disentanglement
Gaurav Patel, Christopher M. Sandino, Behrooz Mahasseni et al.
ICLR 2025 poster, arXiv:2410.02147
6 citations
Gatekeeper: Improving Model Cascades Through Confidence Tuning
Stephan Rabanser, Nathalie Rauschmayr, Achin Kulshrestha et al.
NeurIPS 2025 poster, arXiv:2502.19335
4 citations
KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments
Junyoung Park, Dalton Jones, Matthew Morse et al.
NeurIPS 2025 poster, arXiv:2504.15364
11 citations