Papers by Wayne Xiong
2 papers found
Integrative Decoding: Improving Factuality via Implicit Self-consistency
Yi Cheng, Xiao Liang, Yeyun Gong et al.
ICLR 2025 poster, arXiv:2410.01556
6 citations
Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning
Yu Fu, Zefan Cai, Abedelkadir Asi et al.
ICLR 2025 poster
54 citations