Maximizing the probability of attaining a target prior to extinction

Abstract

We present a dynamic programming-based solution to the problem of maximizing the probability of attaining a target set before hitting a cemetery set for a discrete-time Markov control process. Under mild hypotheses we establish that there exists a deterministic stationary policy that achieves the maximum value of this probability. We demonstrate how the maximization of this probability can be computed through the maximization of an expected total reward until the first hitting time to either the target or the cemetery set. Martingale characterizations of thrifty, equalizing, and optimal policies in the context of our problem are also established.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…