A Generalized Fundamental Matrix for Computing Fundamental Quantities of Markov Systems
Abstract
As is well known, the fundamental matrix $(I - P + e\pi)^{-1}$ plays an important role in the performance analysis of Markov systems, where $P$ is the transition probability matrix, $e$ is the column vector of ones, and $\pi$ is the row vector of the steady-state distribution. It is used to compute the performance potential (relative value function) of Markov decision processes under the average criterion, as in $g = (I - P + e\pi)^{-1} f$, where $g$ is the column vector of performance potentials and $f$ is the column vector of reward functions. However, $\pi$ must be pre-computed before $(I - P + e\pi)^{-1}$ can be computed. In this paper, we derive a generalized version of the fundamental matrix, $(I - P + er)^{-1}$, where $r$ can be any given row vector satisfying $re \neq 0$. With this generalized fundamental matrix, we can compute $g = (I - P + er)^{-1} f$. The steady-state distribution is computed as $\pi = r(I - P + er)^{-1}$. The Q-factors at every state-action pair can also be computed in a similar way. These formulas may offer insight into how to efficiently compute or estimate the values of $g$, $\pi$, and the Q-factors, which are fundamental quantities for the performance optimization of Markov systems.
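The two formulas in the abstract can be checked numerically on a small example. The sketch below is illustrative only and is not taken from the paper: the 3-state transition matrix, the reward vector, and the choice r = (1, 0, 0) are made-up values, and the chain is assumed to be irreducible so that the steady-state distribution is unique. It uses NumPy and solves linear systems rather than forming the inverse explicitly.

```python
import numpy as np

# Hypothetical 3-state chain (all entries positive, so it is irreducible).
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
f = np.array([1.0, 2.0, 4.0])      # hypothetical reward vector
n = P.shape[0]
e = np.ones((n, 1))                # column vector of ones
r = np.array([[1.0, 0.0, 0.0]])    # any row vector with r e != 0 (here r e = 1)

# Generalized matrix A = I - P + e r; its inverse is the generalized fundamental matrix.
A = np.eye(n) - P + e @ r

# pi = r A^{-1}  is equivalent to solving  A^T pi^T = r^T.
pi = np.linalg.solve(A.T, r.ravel())

# g = A^{-1} f, obtained without knowing pi in advance.
g = np.linalg.solve(A, f)

# Long-run average reward implied by pi.
eta = pi @ f

# For comparison: potentials from the classical matrix, which needs pi first.
g_classic = np.linalg.solve(np.eye(n) - P + e @ pi.reshape(1, n), f)

print("pi  =", pi)
print("eta =", eta)
print("g - g_classic =", g - g_classic)
```

In this example the recovered π satisfies πP = π and sums to one without any explicit normalization, and g satisfies the Poisson equation (I - P) g = f - η e. The two potential vectors differ only by a constant shift in every component, which is consistent with potentials being relative values.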