When do discounted-optimal policies also optimize the gain?

Victor Boone

Abstract

In this technical note, we establish an upper bound on the discount-factor threshold starting from which all discounted-optimal deterministic policies are gain-optimal, and we prove on an example that this bound is tight. Because this theoretical threshold raises computability issues, we also provide a weaker bound that can be computed in polynomial time on ergodic MDPs.
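
To make the notion of such a threshold concrete, the following is a minimal sketch on a hypothetical two-state deterministic MDP. The MDP, the action names `stay` and `go`, and all function names are illustrative assumptions and are not taken from the note. In state 0, `stay` earns 1/2 forever while `go` earns 0 once and then 1 forever in an absorbing state, so the discounted-optimal choice only coincides with the gain-optimal one when the discount factor is large enough (here, strictly above 1/2).

```python
from fractions import Fraction

# Hypothetical two-state deterministic MDP (an illustration, not from the note).
# State 0: action "stay" -> reward 1/2, next state 0
#          action "go"   -> reward 0,   next state 1
# State 1: single action -> reward 1,   next state 1 (absorbing)
ACTIONS = ("stay", "go")


def discounted_value(action, gamma):
    """Discounted value from state 0 of the stationary policy playing `action` there."""
    if action == "stay":
        return Fraction(1, 2) / (1 - gamma)  # 1/2 + gamma/2 + gamma^2/2 + ...
    return gamma / (1 - gamma)               # 0 + gamma + gamma^2 + ...


def gain(action):
    """Long-run average reward from state 0 under the same policy."""
    return Fraction(1, 2) if action == "stay" else Fraction(1)


def discounted_optimal_actions(gamma):
    """Set of actions in state 0 that maximize the discounted value."""
    values = {a: discounted_value(a, gamma) for a in ACTIONS}
    best = max(values.values())
    return {a for a, v in values.items() if v == best}


def all_discounted_optimal_are_gain_optimal(gamma):
    """True if every discounted-optimal action in state 0 is also gain-optimal."""
    best_gain = max(gain(a) for a in ACTIONS)
    return all(gain(a) == best_gain for a in discounted_optimal_actions(gamma))


if __name__ == "__main__":
    for gamma in (Fraction(2, 5), Fraction(1, 2), Fraction(3, 5)):
        print(gamma, sorted(discounted_optimal_actions(gamma)),
              all_discounted_optimal_are_gain_optimal(gamma))
```

In this toy instance, both actions tie at a discount factor of exactly 1/2, so not all discounted-optimal policies are gain-optimal there; the property holds only strictly above 1/2. Exact rational arithmetic is used so that the tie is detected without floating-point error.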
