Value-Set Iteration: Computing Optimal Correlated Equilibria in Infinite-Horizon Multi-Player Stochastic Games

Abstract

We study the problem of computing optimal correlated equilibria (CEs) in infinite-horizon multi-player stochastic games, where correlation signals are provided over time. In this setting, optimal CEs require history-dependent policies; this poses new representational and algorithmic challenges as the number of possible histories grows exponentially with the number of time steps. We focus on computing (ε, δ)-optimal CEs -- solutions that achieve a value within ε of an optimal CE, while allowing the agents' incentive constraints to be violated by at most δ. Our main result is an algorithm that computes an (ε, δ)-optimal CE in time polynomial in 1/(εδ(1 - γ))^(n+1), where γ is the discount factor, and n is the number of agents. For (a slightly more general variant of) turn-based games, we further reduce the complexity to a polynomial in n. We also establish that the bi-criterion approximation is necessary by proving matching inapproximability bounds. Our technical core is a novel approach based on inducible value sets, which leverages a compact representation of history-dependent CEs through the values they induce to overcome the representational challenge. We develop the value-set iteration algorithm -- which operates by iteratively updating estimates of inducible value sets -- and characterize CEs as the greatest fixed point of the update map. Our algorithm provides a groundwork for computing optimal CEs in general multi-player stochastic settings.
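The abstract describes the algorithm as repeatedly updating a set estimate until it reaches the greatest fixed point of an update map. The sketch below is not the paper's algorithm; it only illustrates the generic pattern of greatest-fixed-point iteration on a finite toy problem. The graph `successors`, the function names, and the rule "a node is kept only if one of its successors is kept" are all hypothetical stand-ins chosen for illustration, loosely echoing the idea that a value is retained only if it is supported by continuation values still in the set.

```python
def greatest_fixed_point(update, top):
    """Iterate a monotone, shrinking set map downward from `top` until stable."""
    current = frozenset(top)
    while True:
        nxt = frozenset(update(current))
        if nxt == current:
            return current
        current = nxt


# Hypothetical toy graph: node -> list of successor nodes.
successors = {"a": ["b"], "b": ["a"], "c": ["d"], "d": []}


def update(kept):
    # Keep a node only if at least one of its successors is still kept.
    return {v for v in kept if any(w in kept for w in successors[v])}


# Start from the full set (the top element) and shrink to the fixed point.
surviving = greatest_fixed_point(update, successors.keys())
print(sorted(surviving))
```

Starting from the largest candidate set and only ever removing elements is what makes the result the greatest fixed point rather than some smaller one. On a finite set the loop must terminate because each non-final step removes at least one element.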
