Quantum speedups for convex dynamic programming
Abstract
We present a quantum algorithm to solve dynamic programming problems with convex value functions. For linear discrete-time systems with a $d$-dimensional state space of size $N$, the proposed algorithm outputs a quantum-mechanical representation of the value function in time $O(T \gamma^{dT} \mathrm{polylog}(N, (T/\varepsilon)^{d}))$, where $\varepsilon$ is the accuracy of the solution, $T$ is the time horizon, and $\gamma$ is a problem-specific parameter depending on the condition numbers of the cost functions. This allows us to evaluate the value function at any fixed state in time $O(T \gamma^{dT} \sqrt{N}\, \mathrm{polylog}(N, (T/\varepsilon)^{d}))$, and the corresponding optimal action can be recovered by solving a convex program. The class of optimization problems to which our algorithm can be applied includes provably hard stochastic dynamic programs. Finally, we show that the algorithm obtains a quadratic speedup (up to polylogarithmic factors) compared to the classical Bellman approach on some dynamic programs with continuous state space that have $\gamma = 1$.
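For orientation, the classical Bellman approach used as the baseline can be sketched as a backward recursion over a discretized state space. The following is a minimal illustration only, not the quantum algorithm and not the authors' implementation. The scalar dynamics x_next = a*x + b*u, the quadratic costs, the grids, and the function name bellman_backward are all assumptions chosen for the example, and successor states are snapped to the nearest grid point as a crude discretization.

```python
import numpy as np

def bellman_backward(states, actions, a, b, stage_cost, terminal_cost, horizon):
    """Classical backward recursion on a 1-D grid (illustrative baseline only).

    Assumed linear dynamics: x_next = a * x + b * u. Successor states are
    snapped to the nearest grid point; states leaving the grid are clipped
    to its boundary by the same nearest-point rule.
    """
    values = terminal_cost(states)            # V_T on the grid, shape (N,)
    x = states[:, None]                       # shape (N, 1)
    u = actions[None, :]                      # shape (1, M)
    x_next = a * x + b * u                    # shape (N, M)
    # Index of the nearest grid point for every successor state.
    idx = np.abs(x_next[:, :, None] - states[None, None, :]).argmin(axis=2)
    cost = stage_cost(x, u)                   # shape (N, M)
    for _ in range(horizon):
        q = cost + values[idx]                # cost-to-go for every (x, u) pair
        values = q.min(axis=1)                # V_t(x) = min_u q(x, u)
    return values

# Hypothetical example: convex quadratic stage and terminal costs.
states = np.linspace(-2.0, 2.0, 41)
actions = np.linspace(-1.0, 1.0, 21)
values = bellman_backward(
    states, actions, a=1.0, b=1.0,
    stage_cost=lambda x, u: x**2 + u**2,
    terminal_cost=lambda x: x**2,
    horizon=5,
)
```

Each of the T stages in this sketch touches every grid point, so the work grows at least linearly in the number of grid points N. That dependence on N is what the quadratic speedup claimed above is measured against.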