Learning Expected Reward for Switched Linear Control Systems: A Non-Asymptotic View

Abstract

In this work, we show existence of invariant ergodic measure for switched linear dynamical systems (SLDSs) under a norm-stability assumption of system dynamics in some unbounded subset of Rn. Consequently, given a stationary Markov control policy, we derive non-asymptotic bounds for learning expected reward (w.r.t the invariant ergodic measure our closed-loop system mixes to) from time-averages using Birkhoff's Ergodic Theorem. The presented results provide a foundation for deriving non-asymptotic analysis for average reward-based optimal control of SLDSs. Finally, we illustrate the presented theoretical results in two case-studies.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…