A Stochastic GDA Method With Backtracking For Solving Nonconvex Concave Minimax Problems

Abstract

We propose a stochastic GDA (gradient descent ascent) method with backtracking (SGDA-B) to solve nonconvex-concave (NCC) minimax problems of the form $\min_{x} \max_{y} \sum_{i=1}^{N} g_i(x_i) + f(x,y) - h(y)$, where $h$ and $g_i$ for $i=1,\ldots,N$ are closed, convex functions, and for some $L,\mu\geq 0$, $f$ is $L$-smooth and $f(x,\cdot)$ is $\mu$-strongly concave for all $x$ in the problem domain. We consider the stochastic setting where one only has access to an unbiased stochastic oracle of $\nabla f$ with a finite variance bound $\sigma^2$. While most existing methods assume knowledge of $L$, $\mu$ and/or $\sigma^2$, SGDA-B is agnostic to all of these problem parameters. Moreover, SGDA-B can support random block-coordinate updates. In the deterministic setting, i.e., $\sigma^2=0$ and one can compute $\nabla f$ exactly, SGDA-B can compute an $\epsilon$-stationary point within $\mathcal{O}(L\kappa^2/\epsilon^2)$ and $\mathcal{O}(L^3/\epsilon^4)$ gradient calls when $\mu>0$ and $\mu=0$, respectively, where $\kappa \triangleq L/\mu$. In the stochastic setting, i.e., $\sigma^2>0$, for any $p\in(0,1)$ and $\epsilon>0$, it can compute an $\epsilon$-stationary point with probability at least $1-p$, which requires $\mathcal{O}(L\kappa^3\epsilon^{-4}\log_2(1/p))$ and $\mathcal{O}(L^4\epsilon^{-7}\log_2(1/p))$ stochastic oracle calls when $\mu>0$ and $\mu=0$, respectively. To our knowledge, SGDA-B is the first GDA-type method with backtracking to solve NCC minimax problems, and it achieves the best complexity among the methods that are agnostic to $L$, $\mu$ and $\sigma^2$. We also provide numerical results for SGDA-B on a distributionally robust learning problem, illustrating the potential performance gains that can be achieved by SGDA-B.
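To make the idea of backtracking on an unknown smoothness constant concrete, the following is a minimal Python sketch of a deterministic, alternating GDA loop with a doubling rule. It is not the SGDA-B algorithm: the abstract does not specify the backtracking condition, so the descent-lemma-style checks, the doubling rule, the toy objective $f(x,y) = xy - \tfrac{1}{2}y^2 + \tfrac{1}{2}\cos(x)$ (nonconvex in $x$, 1-strongly concave in $y$, with $g_i = h = 0$), and all function and parameter names below are assumptions chosen for illustration only.

```python
import math

# Hypothetical toy objective (an assumption, not from the paper):
# nonconvex in x, 1-strongly concave in y, with g_i = h = 0.
def f(x, y):
    return x * y - 0.5 * y * y + 0.5 * math.cos(x)

def grad_x(x, y):
    return y - 0.5 * math.sin(x)

def grad_y(x, y):
    return x - y

def gda_backtracking(x, y, L_est=0.1, eps=1e-4, max_iter=1000, tol=1e-12):
    """Alternating GDA with a doubling estimate of the smoothness constant.

    Illustrative only: the acceptance test below is a generic
    descent-lemma check, assumed here, not the condition used by SGDA-B.
    """
    for it in range(max_iter):
        gx, gy = grad_x(x, y), grad_y(x, y)
        # Stop once the gradient norm is below the target accuracy.
        if math.hypot(gx, gy) <= eps:
            return x, y, L_est, it
        while True:
            step = 1.0 / L_est
            x_new = x - step * gx              # descent step in x
            gy_new = grad_y(x_new, y)
            y_new = y + step * gy_new          # ascent step in y at the new x
            dx, dy = x_new - x, y_new - y
            # Quadratic upper-bound checks with the current estimate L_est;
            # tol is a small slack that absorbs floating-point rounding.
            ok_x = f(x_new, y) <= f(x, y) + gx * dx + 0.5 * L_est * dx * dx + tol
            ok_y = (-f(x_new, y_new)
                    <= -f(x_new, y) - gy_new * dy + 0.5 * L_est * dy * dy + tol)
            if ok_x and ok_y:
                break
            L_est *= 2.0                       # backtrack: halve the step size
        x, y = x_new, y_new
    return x, y, L_est, max_iter

x, y, L_est, iters = gda_backtracking(2.0, -1.0)
print(f"x={x:.2e}, y={y:.2e}, L_est={L_est}, iterations={iters}")
```

The sketch starts from a deliberately small estimate of $L$ and only ever increases it, so no smoothness constant has to be supplied in advance. A stochastic variant would additionally need to handle noisy gradient and function evaluations in the acceptance test, which this sketch does not attempt.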
