Poster "large language model training" Papers
2 papers found
COAT: Compressing Optimizer states and Activations for Memory-Efficient FP8 Training
Haocheng Xi, Han Cai, Ligeng Zhu et al.
ICLR 2025 poster arXiv:2410.19313
19 citations
Understanding the Training Speedup from Sampling with Approximate Losses
Rudrajit Das, Xi Chen, Bertram Ieong et al.
ICML 2024 poster arXiv:2402.07052