ICLR "memory-efficient optimization" Papers
2 papers found
Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models
Zeman Li, Xinwei Zhang, Peilin Zhong et al.
ICLR 2025 (poster) · arXiv:2410.06441 · 11 citations
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Tianjin Huang, Ziquan Zhu, Gaojie Jin et al.
ICLR 2025 (poster) · arXiv:2501.06842 · 15 citations