NeurIPS Poster "model efficiency" Papers
2 papers found
First Attentions Last: Better Exploiting First Attentions for Efficient Parallel Training
Gyudong Kim, Hyukju Na, Jin Kim et al.
NeurIPS 2025poster
Hierachical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM
Yongqiang Yao, Jingru Tan, Kaihuan Liang et al.
NeurIPS 2025poster
2
citations