Federated Learning: A Stochastic Approximation Approach
Abstract
This paper considers federated learning (FL) in a stochastic approximation (SA) framework. Each client $i$ trains a local model on its dataset $D^{(i)}$ and periodically transmits the model parameters $w^{(i)}_n$ to a central server, where they are aggregated into a global model parameter $w_n$ and sent back. The clients then continue training after re-initializing their local models with the global model parameters. Prior works typically assumed constant (and often identical) step sizes (learning rates) across clients for model training. As a consequence, the aggregated model converges only in expectation. In this work, client-specific tapering step sizes $a^{(i)}_n$ are used. The global model is shown to track an ODE whose forcing function is the weighted sum of the negative gradients of the individual clients. The weights are the limiting ratios $p^{(i)} = \lim_{n \to \infty} a^{(i)}_n / a^{(1)}_n$ of the step sizes, where $a^{(1)}_n \ge a^{(i)}_n$ for all $n$. Unlike the constant step-size case, convergence here holds with probability one. In this framework, clients with larger $p^{(i)}$ exert a greater influence on the global model than those with smaller $p^{(i)}$, which can be used to favor clients that hold rare or uncommon data. Numerical experiments were conducted to validate the convergence and to demonstrate how the choice of step sizes regulates the influence of the clients.
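The role of the step-size ratios can be illustrated with a small simulation. The following is a minimal sketch under assumptions that are not taken from the paper: each client has a scalar quadratic loss centered at a hypothetical value, performs one noisy gradient step per round, and the server aggregates by plain averaging. The function names, the decay exponent 0.75, and the Gaussian noise model are illustrative choices only. Under these assumptions the global model should settle near the ratio-weighted mean of the client centers rather than the plain mean.

```python
import random


def run_federated_sa(centers, ratios, rounds=20000, noise_std=1.0, seed=0):
    """Toy federated SA loop with client-specific tapering step sizes.

    Assumptions (illustrative, not the paper's setup):
      - client i has loss 0.5 * (w - centers[i])**2, so its gradient is w - centers[i]
      - one local noisy gradient step per communication round
      - the server aggregates by simple averaging
      - ratios[i] plays the role of the limiting step-size ratio for client i,
        with ratios[0] == 1 for the reference client
    """
    rng = random.Random(seed)
    w_global = 0.0
    for n in range(rounds):
        # Reference step size: sum diverges, sum of squares converges.
        base_step = 1.0 / (n + 1) ** 0.75
        local_models = []
        for center, ratio in zip(centers, ratios):
            step = ratio * base_step  # client-specific tapering step size
            noisy_grad = (w_global - center) + rng.gauss(0.0, noise_std)
            # Each client starts from the current global model.
            local_models.append(w_global - step * noisy_grad)
        # Server aggregation, then broadcast back to the clients.
        w_global = sum(local_models) / len(local_models)
    return w_global


def weighted_stationary_point(centers, ratios):
    """Zero of the ratio-weighted sum of the quadratic clients' gradients."""
    return sum(r * c for r, c in zip(ratios, centers)) / sum(ratios)


if __name__ == "__main__":
    centers = [0.0, 10.0]   # hypothetical client optima
    ratios = [1.0, 0.25]    # second client steps four times more slowly
    print("simulated :", run_federated_sa(centers, ratios))
    print("predicted :", weighted_stationary_point(centers, ratios))
```

In this toy setting, giving the second client a larger ratio would pull the limit toward its center, which mirrors the way larger step-size ratios give a client more influence on the global model.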