Beidi Chen

18 Papers · 114 Total Citations

Papers (18)

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
ICML 2025 · 56 citations

JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention
ICLR 2024 · 46 citations

Speculative Prefill: Turbocharging TTFT with Lightweight and Training-Free Token Importance Estimation
ICML 2025 · 7 citations

Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity
ICLR 2025 · 5 citations

LoCoCo: Dropping In Convolutions for Long Context Compression
ICML 2024 · 0 citations

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
ICML 2024 · 0 citations

HexGen: Generative Inference of Large Language Model over Heterogeneous Environment
ICML 2024 · 0 citations

Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
ICML 2024 · 0 citations

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
ICML 2024 · 0 citations

Soft Prompt Recovers Compressed LLMs, Transferably
ICML 2024 · 0 citations

Fast and Accurate Stochastic Gradient Estimation
NeurIPS 2019 · 0 citations

Scatterbrain: Unifying Sparse and Low-rank Attention
NeurIPS 2021 · 0 citations

Locality Sensitive Teaching
NeurIPS 2021 · 0 citations

Fine-tuning Language Models over Slow Networks using Activation Quantization with Guarantees
NeurIPS 2022 · 0 citations

Decentralized Training of Foundation Models in Heterogeneous Environments
NeurIPS 2022 · 0 citations

Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions
NeurIPS 2023 · 0 citations

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
NeurIPS 2023 · 0 citations

Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
NeurIPS 2023 · 0 citations