Blackwell optimality in risk-sensitive stochastic control

Abstract

In this paper, we consider a discrete-time Markov decision process (MDP) on a finite state-action space with a long-run risk-sensitive criterion as the objective function. We discuss the concept of Blackwell optimality and comment on the intricacies that arise when the risk-neutral expectation is replaced by the risk-sensitive entropy. We also show the relation between Blackwell optimality and ultimate stationarity, and provide an illustrative example that clarifies the structural difference between these two concepts.
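
To make the long-run risk-sensitive criterion concrete, the following is a minimal sketch and not the paper's own construction. It assumes the standard entropic form, lim (1/n)(1/gamma) ln E[exp(gamma * sum of rewards)], evaluated for one fixed stationary policy whose transition matrix has strictly positive entries. Under those assumptions the limit equals (1/gamma) times the logarithm of the spectral radius of the matrix Q[x][y] = exp(gamma * r[x]) * P[x][y]. The function name, the two-state chain, and the use of rewards rather than costs are all hypothetical choices made for illustration.

```python
import math


def risk_sensitive_average(P, r, gamma, iters=2000):
    """Long-run entropic average of rewards r under transition matrix P.

    Assumptions (illustrative, not taken from the paper): P has strictly
    positive entries, gamma != 0, and the criterion is
    (1/gamma) * ln(spectral radius of Q), Q[x][y] = exp(gamma*r[x]) * P[x][y].
    """
    n = len(P)
    Q = [[math.exp(gamma * r[x]) * P[x][y] for y in range(n)] for x in range(n)]
    v = [1.0] * n          # positive starting vector with max entry 1
    log_rho = 0.0
    for _ in range(iters):
        # One power-iteration step: w = Q v, then rescale so max(v) == 1.
        w = [sum(Q[x][y] * v[y] for y in range(n)) for x in range(n)]
        norm = max(w)
        v = [wi / norm for wi in w]
        log_rho = math.log(norm)  # converges to ln(spectral radius)
    return log_rho / gamma


# Hypothetical two-state chain induced by some fixed stationary policy.
P = [[0.9, 0.1],
     [0.2, 0.8]]
r = [0.0, 1.0]

risk_averse = risk_sensitive_average(P, r, -1.0)    # gamma < 0
near_neutral = risk_sensitive_average(P, r, 1e-4)   # gamma close to 0
risk_seeking = risk_sensitive_average(P, r, 1.0)    # gamma > 0
```

In this sketch, the value for gamma close to zero approaches the risk-neutral long-run average (here the stationary mean of r, equal to 1/3), while negative and positive gamma fall below and above it respectively. Comparing policies under such a criterion, as gamma or a discount factor varies, is the setting in which Blackwell optimality and ultimate stationarity are discussed.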
