Occupation measures arising in finite stochastic games
Abstract
Shapley (1953) introduced two-player zero-sum discounted stochastic games, henceforth stochastic games, a model where a state variable follows a two-controlled Markov chain, the players receive rewards at each stage which add up to 0, and each maximizes the normalized λ-discounted sum of stage rewards, for some fixed discount rate λ ∈ (0,1]. In this paper, we study asymptotic occupation measures arising in these games, as the discount rate goes to 0.
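For concreteness, the normalized discounted sum mentioned above can be sketched as follows. The symbols are assumptions introduced here for illustration and do not appear in the abstract: λ denotes the discount rate, g_m the stage reward of the maximizing player at stage m, and k_m the state at stage m. The second expression is one standard way to define a discounted occupation measure of a state k and may differ in detail from the definition used in the paper.

```latex
% Normalized \lambda-discounted sum of stage rewards (g_m is an assumed notation).
% The weights \lambda (1-\lambda)^{m-1} sum to 1 over m \ge 1, hence "normalized".
\[
  \sum_{m \ge 1} \lambda (1-\lambda)^{m-1} g_m , \qquad \lambda \in (0,1].
\]

% A standard discounted occupation measure of a state k (assumed form):
% the expected discounted fraction of stages spent in state k.
\[
  \pi_\lambda(k) \;=\; \sum_{m \ge 1} \lambda (1-\lambda)^{m-1} \, \mathbb{P}(k_m = k).
\]
```

With λ = 1 only the first stage counts, while as λ goes to 0 the weights spread over an increasingly long horizon, which is the asymptotic regime the abstract refers to.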