Revisiting Consensus Error: A Fine-grained Analysis of Local SGD under Second-order Data Heterogeneity

0citations
0
Citations
#2067
in NeurIPS 2025
of 5858 papers
4
Authors
4
Data Points

Abstract

Local SGD, or Federated Averaging, is one of the most widely used algorithms for distributed optimization. Although it often outperforms alternatives such as mini-batch SGD, existing theory has not fully explained this advantage under realistic assumptions about data heterogeneity. Recent work has suggested that a second-order heterogeneity assumption may suffice to justify the empirical gains of local SGD. We confirm this conjecture by establishing new upper and lower bounds on the convergence of local SGD. These bounds demonstrate how a low second-order heterogeneity, combined with third-order smoothness, enables local SGD to interpolate between heterogeneous and homogeneous regimes while maintaining communication efficiency. Our main technical contribution is a refined analysis of the consensus error, a central quantity in such results. We validate our theory with experiments on a distributed linear regression task.

Citation History

Jan 26, 2026
0
Jan 27, 2026
0
Jan 27, 2026
0
Jan 31, 2026
0