Papers by Chenheng Zhang
3 papers found
Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads
Zhoutong Wu, Yuan Zhang, Yiming Dong et al.
NeurIPS 2025, poster, arXiv:2510.16807
Language Ranker: A Lightweight Ranking framework for LLM Decoding
Chenheng Zhang, Tianqi Du, Jizhe Zhang et al.
NeurIPS 2025, poster
What is Wrong with Perplexity for Long-context Language Modeling?
Lizhe Fang, Yifei Wang, Zhaoyang Liu et al.
ICLR 2025, poster