"pre-training" Papers
3 papers found
$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources
Apoorv Khandelwal, Tian Yun, Nihal V. Nayak et al.
COLM 2025 | paper | arXiv:2410.23261
6 citations
Efficient Construction of Model Family through Progressive Training Using Model Expansion
Kazuki Yano, Sho Takase, Sosuke Kobayashi et al.
COLM 2025 | paper | arXiv:2504.00623
5 citations
Spike No More: Stabilizing the Pre-training of Large Language Models
Sho Takase, Shun Kiyono, Sosuke Kobayashi et al.
COLM 2025 | paper | arXiv:2312.16903
31 citations