On Nonconvex Decentralized Gradient Descent

Abstract

Consensus optimization has received considerable attention in recent years. A number of decentralized algorithms have been proposed for convex consensus optimization. However, to the behaviors or consensus nonconvex optimization, our understanding is more limited. When we lose convexity, we cannot hope our algorithms always return global solutions though they sometimes still do sometimes. Somewhat surprisingly, the decentralized consensus algorithms, DGD and Prox-DGD, retain most other properties that are known in the convex setting. In particular, when diminishing (or constant) step sizes are used, we can prove convergence to a (or a neighborhood of) consensus stationary solution under some regular assumptions. It is worth noting that Prox-DGD can handle nonconvex nonsmooth functions if their proximal operators can be computed. Such functions include SCAD and q quasi-norms, q∈[0,1). Similarly, Prox-DGD can take the constraint to a nonconvex set with an easy projection. To establish these properties, we have to introduce a completely different line of analysis, as well as modify existing proofs that were used the convex setting.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…