Papers by Yao Luo
3 papers found
FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
Xunhao Lai, Jianqiao Lu, Yao Luo et al.
ICLR 2025, poster, arXiv:2502.20766
51 citations
Model Merging in Pre-training of Large Language Models
Yunshui Li, Yiyuan Ma, Shen Yan et al.
NeurIPS 2025, poster
Why Does the Effective Context Length of LLMs Fall Short?
Chenxin An, Jun Zhang, Ming Zhong et al.
ICLR 2025, poster, arXiv:2410.18745