Hedging in games: Faster convergence of external and swap regrets

Abstract

We consider the setting where players run the Hedge algorithm or its optimistic variant to play an n-action game repeatedly for T rounds. 1) For two-player games, we show that the regret of optimistic Hedge decays at O( 1/T 5/6 ), improving the previous bound O(1/T3/4) by Syrgkanis, Agarwal, Luo and Schapire (NIPS'15) 2) In contrast, we show that the convergence rate of vanilla Hedge is no better than (1/ T), addressing an open question posted in Syrgkanis, Agarwal, Luo and Schapire (NIPS'15). For general m-player games, we show that the swap regret of each player decays at rate O(m1/2 (n/T)3/4) when they combine optimistic Hedge with the classical external-to-internal reduction of Blum and Mansour (JMLR'07). The algorithm can also be modified to achieve the same rate against itself and a rate of O(n/T) against adversaries. Via standard connections, our upper bounds also imply faster convergence to coarse correlated equilibria in two-player games and to correlated equilibria in multiplayer games.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…