First-Order Methods for Wasserstein Distributionally Robust MDP

Christian Kroer

Abstract

Markov decision processes (MDPs) are known to be sensitive to parameter specification. Distributionally robust MDPs alleviate this issue by allowing for ambiguity sets, which give a set of possible distributions over parameter sets. The goal is to find an optimal policy with respect to the worst-case parameter distribution. We propose a framework for solving distributionally robust MDPs via first-order methods, and instantiate it for several types of Wasserstein ambiguity sets. By developing efficient proximal updates, our algorithms achieve a convergence rate of $O\left(N A^{2.5} S^{3.5} \log(S) \log(\epsilon^{-1}) \epsilon^{-1.5}\right)$ for the number of kernels $N$ in the support of the nominal distribution, states $S$, and actions $A$; this rate varies slightly based on the Wasserstein setup. Our dependence on $N$, $A$, and $S$ is significantly better than that of existing methods, which have a complexity of $O\left(N^{3.5} A^{3.5} S^{4.5} \log^{2}(\epsilon^{-1})\right)$. Numerical experiments show that our algorithm is significantly more scalable than state-of-the-art approaches across several domains.
