Authors: Peter Richtárik, Konstantin Mishchenko, Ahmed Khaled
DOI:
Keywords:
Abstract: We provide a new analysis of local SGD, removing unnecessary assumptions and elaborating on the difference between two data regimes: identical and heterogeneous. In both cases, we improve the existing theory and provide values of the optimal stepsize and the optimal number of local iterations. Our bounds are based on a new notion of variance that is specific to local SGD methods with different data. The tightness of our results is guaranteed by recovering known statements when we plug in $H=1$, where $H$ is the number of local steps. The empirical evidence further validates the severe impact of data heterogeneity on the performance of local SGD.