Stochastic Games with Limited Public Memory

Abstract

We study the memory resources required for near-optimal play in two-player zero-sum stochastic games with the long-run average payoff. Although optimal strategies may not exist in such games, near-optimal strategies always do. Mertens and Neyman (1981) proved that in any stochastic game, for any ε > 0, there exist uniform ε-optimal memory-based strategies -- i.e., strategies that are ε-optimal in all sufficiently long n-stage games -- that use at most O(n) memory states within the first n stages. We improve this bound on the number of memory states by proving that in any stochastic game, for any ε > 0, there exist uniform ε-optimal memory-based strategies that use at most O(log n) memory states in the first n stages. Moreover, we establish the existence of uniform ε-optimal memory-based strategies whose memory updating and action selection are time-independent and such that, with probability close to 1, for all n, the number of memory states used up to stage n is at most O(log n). This result cannot be extended to strategies with bounded public memory -- even if time-dependent memory updating and action selection are allowed. This impossibility is illustrated in the Big Match -- a well-known stochastic game where the stage payoffs to Player 1 are 0 or 1. Although for any ε > 0, there exist strategies of Player 1 that guarantee a payoff exceeding 1/2 - ε in all sufficiently long n-stage games, we show that any strategy of Player 1 that uses a finite public memory fails to guarantee a payoff greater than ε in any sufficiently long n-stage game.
