New Sufficient Conditions for Lower Bounding the Optimal Policy of a POMDP using Lehmann Precision

Abstract

This paper provides new sufficient conditions under which the optimal policy of a partially observed Markov decision process (POMDP) can be lower bounded by a myopic policy. The two proposed conditions, Lehmann precision and copositive dominance, completely fix the problems with two crucial assumptions in the well-known papers of Lovejoy (1987) and Rieder (1991). For controlled sensing POMDPs, Lehmann precision exploits both convexity and monotonicity of the value function, whereas the classical Blackwell dominance exploits only convexity. Numerical examples are presented in which Lehmann precision holds but Blackwell dominance does not, illustrating the usefulness of the main result in controlled sensing applications.
