Gradient Descent-Ascent Provably Converges to Strict Local Minmax Equilibria with a Finite Timescale Separation

Lillian Ratliff

Abstract

We study the effect of a finite timescale separation parameter τ on gradient descent-ascent in two-player non-convex, non-concave zero-sum games, where the learning rate of player 1 is denoted by γ1 and the learning rate of player 2 is defined to be γ2 = τγ1. Existing work analyzing the role of timescale separation in gradient descent-ascent has primarily focused on two edge cases: the players sharing a learning rate (τ = 1), and the maximizing player approximately converging between each update of the minimizing player (τ → ∞). For the parameter choice τ = 1, it is known that the learning dynamics are not guaranteed to converge to a game-theoretically meaningful equilibrium in general. In contrast, Jin et al. (2020) showed that the stable critical points of gradient descent-ascent coincide with the set of strict local minmax equilibria as τ → ∞. In this work, we bridge the gap between past work by showing there exists a finite timescale separation parameter τ* such that x* is a stable critical point of gradient descent-ascent for all τ ∈ (τ*, ∞) if and only if it is a strict local minmax equilibrium. Moreover, we provide an explicit construction for computing τ*, along with corresponding convergence rates and results under deterministic and stochastic gradient feedback. The convergence results we present are complemented by a non-convergence result: given a critical point x* that is not a strict local minmax equilibrium, there exists a finite timescale separation τ0 such that x* is unstable for all τ ∈ (τ0, ∞). Finally, we empirically demonstrate on the CIFAR-10 and CelebA datasets the significant impact timescale separation has on training performance.
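To make the update rule in the abstract concrete, the following is a minimal sketch of gradient descent-ascent with learning rates γ1 and γ2 = τγ1 on a toy quadratic game. The objective f(x, y) = -x² + 2xy - 0.5y², the step size, the step count, and the function names `grad` and `run_gda` are illustrative assumptions and do not come from the paper. For this toy objective the origin is a strict local minmax equilibrium, and the continuous-time linearization has trace 2 - τ and determinant 2τ, so it is stable exactly when τ > 2. The sketch does not implement the paper's construction of τ*.

```python
import math


def grad(x, y):
    """Partial derivatives of the toy objective f(x, y) = -x**2 + 2*x*y - 0.5*y**2."""
    df_dx = -2.0 * x + 2.0 * y
    df_dy = 2.0 * x - y
    return df_dx, df_dy


def run_gda(tau, gamma1=0.01, steps=2000, x0=1.0, y0=1.0):
    """Run simultaneous gradient descent-ascent and return the final distance to (0, 0).

    Player 1 (x) descends with learning rate gamma1.
    Player 2 (y) ascends with learning rate gamma2 = tau * gamma1.
    """
    x, y = x0, y0
    gamma2 = tau * gamma1
    for _ in range(steps):
        gx, gy = grad(x, y)
        # Both players update from the same iterate (simultaneous updates).
        x, y = x - gamma1 * gx, y + gamma2 * gy
    return math.hypot(x, y)


# tau = 1: shared learning rate; tau = 4: the maximizing player moves four times faster.
dist_equal = run_gda(tau=1.0)
dist_separated = run_gda(tau=4.0)
print(f"tau=1: distance to equilibrium after training = {dist_equal:.3e}")
print(f"tau=4: distance to equilibrium after training = {dist_separated:.3e}")
```

In this toy setting the iterates spiral away from the equilibrium when τ = 1 and spiral into it when τ = 4, which mirrors the qualitative claim that a sufficiently large but finite timescale separation makes a strict local minmax equilibrium stable.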
