"large-scale model training" Papers
2 papers found
LayerLock: Non-collapsing Representation Learning with Progressive Freezing
Goker Erdogan, Nikhil Parthasarathy, Catalin Ionescu et al.
ICCV 2025 poster, arXiv:2509.10156
SEPARATE: A Simple Low-rank Projection for Gradient Compression in Modern Large-scale Model Training Process
Hanzhen Zhao, Xingyu Xie, Cong Fang et al.
ICLR 2025 poster
4 citations