Beidi Chen
10 Papers · 114 Total Citations

Papers (10)
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
ICML 2025 · 56 citations
JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention
ICLR 2024 · 46 citations
Speculative Prefill: Turbocharging TTFT with Lightweight and Training-Free Token Importance Estimation
ICML 2025 · 7 citations
Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity
ICLR 2025 · 5 citations
LoCoCo: Dropping In Convolutions for Long Context Compression
ICML 2024 · 0 citations
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
ICML 2024 · 0 citations
HexGen: Generative Inference of Large Language Model over Heterogeneous Environment
ICML 2024 · 0 citations
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
ICML 2024 · 0 citations
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
ICML 2024 · 0 citations
Soft Prompt Recovers Compressed LLMs, Transferably
ICML 2024 · 0 citations