ICLR "llm pre-training" Papers
2 papers found
Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models
Yuda Song, Hanlin Zhang, Carson Eisenach et al.
ICLR 2025, poster, arXiv:2412.02674
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Tianjin Huang, Ziquan Zhu, Gaojie Jin et al.
ICLR 2025, poster, arXiv:2501.06842
15 citations