Papers by Yao Luo (2025)
3 papers found
FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
Xunhao Lai, Jianqiao Lu, Yao Luo et al.
ICLR 2025 (poster) · arXiv:2502.20766
51 citations
Model Merging in Pre-training of Large Language Models
Yunshui Li, Yiyuan Ma, Shen Yan et al.
NeurIPS 2025 (poster) · arXiv:2505.12082
Why Does the Effective Context Length of LLMs Fall Short?
Chenxin An, Jun Zhang, Ming Zhong et al.
ICLR 2025 (poster) · arXiv:2410.18745