Can a Weaker Player Win? Adaptive Play in Repeated Games
Abstract
Consider a two-player game repeated N times. Player 1 can choose between two styles (for interpretability, offensive and defensive), whereas Player 2 uses a single fixed style. Let X_N := #wins - #losses for Player 1 after N games, and define the match gain as E[sign(X_N)], with sign(0) = 0. We assume Player 1 is weaker in the sense that each pure style is losing in expectation. Our objective is to identify the parameter regimes in which Player 1 can nevertheless achieve a positive gain under an optimal adaptive policy. Using dynamic programming, we solve the finite-horizon control problem and numerically identify parameter regimes in which the optimal gain g_N is strictly positive at some horizon N. We also derive structural conditions guaranteeing that g_N is always negative, and regimes (notably when the defensive style (D) is fair) where g_N is nonnegative for all N and can be strictly positive for every N ≥ 2. We then characterize the asymptotic behavior as N → ∞ for a weak player. In the safe case, where the defensive style induces a sure draw, the limiting gain varies continuously with the parameters and may take any value in [0, 1]. In the non-safe case, the gain converges to -1 when both styles are strictly losing, and to 0 when (D) is fair (and non-safe).
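The finite-horizon dynamic program mentioned above can be illustrated with a short backward-induction sketch. This is not the paper's code: the function name `optimal_gain`, the representation of each style as a (win, draw, loss) probability triple, and the numerical values in the usage note are assumptions made for illustration. The state is the current score difference x, the terminal value is sign(x), and at each step Player 1 picks the style with the larger expected continuation value.

```python
def optimal_gain(n_games, styles):
    """Optimal match gain E[sign(X_N)] for Player 1 by backward induction.

    styles: iterable of hypothetical (p_win, p_draw, p_loss) triples, one per
    style available to Player 1; each triple is assumed to sum to 1.
    """
    def sign(x):
        return (x > 0) - (x < 0)

    # Terminal values: with no games left, the payoff is sign(score difference).
    value = {x: float(sign(x)) for x in range(-n_games, n_games + 1)}

    for remaining in range(1, n_games + 1):
        # With `remaining` games left, only |x| <= n_games - remaining is reachable.
        reach = n_games - remaining
        value = {
            x: max(
                w * value[x + 1] + d * value[x] + l * value[x - 1]
                for (w, d, l) in styles
            )
            for x in range(-reach, reach + 1)
        }

    # The match starts level, so the optimal gain is the value at x = 0.
    return value[0]
```

As a usage example under assumed parameters, take an offensive style (0.45, 0, 0.55) and a safe defensive style (0, 1, 0). Over two games, `optimal_gain(2, [(0.45, 0.0, 0.55), (0.0, 1.0, 0.0)])` evaluates to 0.1475: Player 1 attacks first, defends after a win, and attacks again after a loss, which yields a strictly positive gain even though neither pure style is winning in expectation.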