NeurIPS "latency reduction" Papers
3 papers found
Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents
Qizheng Zhang, Michael Wornow, Kunle Olukotun
NeurIPS 2025posterarXiv:2506.14852
7
citations
Fourier Token Merging: Understanding and Capitalizing Frequency Domain for Efficient Image Generation
Jiesong Liu, Xipeng Shen
NeurIPS 2025poster
Think Only When You Need with Large Hybrid-Reasoning Models
Lingjie Jiang, Xun Wu, Shaohan Huang et al.
NeurIPS 2025posterarXiv:2505.14631
35
citations