Zhenheng Tang
Papers: 5
Total Citations: 65
Papers (5)
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs. ICLR 2025. 31 citations.
ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference. NeurIPS 2025. 14 citations.
The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve? ICLR 2025. 10 citations.
ParZC: Parametric Zero-Cost Proxies for Efficient NAS. AAAI 2025. 10 citations.
Pruner-Zero: Evolving Symbolic Pruning Metric From Scratch for Large Language Models. ICML 2024. 0 citations.