Papers by Zaifeng Pan
2 papers found
KVFlow: Efficient Prefix Caching for Accelerating LLM-Based Multi-Agent Workflows
Zaifeng Pan, Ajjkumar Dahyalal Patel, Yipeng Shen et al.
NeurIPS 2025 (oral)
Yggdrasil: Bridging Dynamic Speculation and Static Runtime for Latency-Optimal Tree-Based LLM Decoding
Yue Guan, Changming Yu, Shihan Fang et al.
NeurIPS 2025 (poster)