Online Min-Max Optimization: From Individual Regrets to Cumulative Saddle Points
Abstract
We propose and study an online version of min-max optimization based on cumulative saddle points under a variety of performance measures beyond convex-concave settings. After first observing the incompatibility of (static) Nash equilibrium (SNE-Reg_T) with individual regrets even for strongly convex-strongly concave functions, we propose an alternate static duality gap (SDual-Gap_T) inspired by the online convex optimization (OCO) framework. We provide algorithms that, using a reduction to classic OCO problems, achieve bounds for SDual-Gap_T and a novel dynamic saddle point regret (DSP-Reg_T), which we suggest naturally represents a min-max version of the dynamic regret in OCO. We derive our bounds for SDual-Gap_T and DSP-Reg_T under strong convexity-strong concavity and a min-max notion of exponential concavity (min-max EC), and in addition we establish a class of functions satisfying min-max EC that captures a two-player variant of the classic portfolio selection problem. Finally, for a dynamic notion of regret compatible with individual regrets, we derive bounds under a two-sided Polyak-Łojasiewicz (PL) condition.