An axiomatic approach to Markov decision processes

Abstract

This paper presents an axiomatic approach to finite Markov decision processes where the discount rate is zero. One of the principal difficulties in the undiscounted case is that, even if attention is restricted to stationary policies, a strong overtaking optimal policy need not exist. We provide preference foundations for two criteria that do admit optimal policies: 0-discount optimality and average overtaking optimality. As a corollary of our results, we obtain conditions on a decision maker's preferences that ensure that an optimal policy exists. These results have implications for disciplines where stochastic dynamic programming problems arise, including automatic control, dynamic games, and economic development.
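The non-existence claim can be illustrated with a small numerical sketch. The three-state deterministic process below is a hypothetical example, not one taken from the paper, and it assumes the standard textbook definitions: a policy is strong overtaking optimal if the liminf of its cumulative-reward advantage over every other policy is non-negative, and average overtaking optimal if the same holds for the running (Cesàro) average of that advantage. From a start state `s`, action `a` pays 1 and moves to `x`, while action `b` pays 0 and moves to `y`; afterwards the process is forced around a cycle in which `x -> y` pays -1 and `y -> x` pays +1.

```python
from itertools import accumulate

def reward_stream(first_action, horizon):
    """Rewards of the hypothetical 3-state deterministic MDP described above."""
    # Only state s offers a choice, so a stationary policy is fixed by first_action.
    state, rewards = ("x", [1]) if first_action == "a" else ("y", [0])
    while len(rewards) < horizon:
        if state == "x":
            rewards.append(-1)  # x -> y pays -1
            state = "y"
        else:
            rewards.append(1)   # y -> x pays +1
            state = "x"
    return rewards[:horizon]

def partial_sums(rewards):
    """Cumulative undiscounted reward after 1, 2, ..., n steps."""
    return list(accumulate(rewards))

def cesaro_means(values):
    """Running averages (v_1 + ... + v_n) / n."""
    return [s / n for n, s in enumerate(accumulate(values), start=1)]

H = 1000
sum_a = partial_sums(reward_stream("a", H))
sum_b = partial_sums(reward_stream("b", H))

# Advantage of policy a over policy b at each horizon: 1, -1, 1, -1, ...
diff = [u - v for u, v in zip(sum_a, sum_b)]

# Running average of that advantage, which tends to 0.
avg_diff = cesaro_means(diff)

print(diff[:6])
print(avg_diff[-1])
```

The advantage of `a` over `b` keeps alternating between 1 and -1, so its liminf is -1 in either direction and neither policy overtakes the other in the strong sense. The running average of the advantage tends to 0, so under the assumed definition both policies are average overtaking optimal among the stationary policies of this example.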
