"model sharding" Papers
2 papers found
EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models
Jialiang Cheng, Ning Gao, Yun Yue et al.
ICLR 2025 poster, arXiv:2412.07210
1 citation
Unextractable Protocol Models: Collaborative Training and Inference without Weight Materialization
Alexander Long, Chamin Hewa Koneputugodage, Thalaiyasingam Ajanthan et al.
NeurIPS 2025 poster