Detection-averse optimal and receding-horizon control for Markov decision processes
Abstract
In this paper, we consider a Markov decision process (MDP) in which the ego agent pursues a nominal objective while also needing to hide its state from detection by an adversary. After formulating the problem, we first propose a value iteration (VI) approach to solve it. To overcome the "curse of dimensionality" and thus scale to larger problems, we then propose a receding-horizon optimization (RHO) approach that yields approximate solutions. We use examples to illustrate and compare the VI and RHO approaches and to show the potential of our problem formulation for practical applications.