NEURIPS 2025 "gradient checkpointing" Papers
2 papers found
Hierachical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM
Yongqiang Yao, Jingru Tan, Kaihuan Liang et al.
NEURIPS 2025poster
2
citations
StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs
Qijun Luo, Mengqi Li, Lei Zhao et al.
NEURIPS 2025posterarXiv:2506.03077
1
citations