Change Point Detection Approach for Online Control of Unknown Time Varying Dynamical Systems
Abstract
We propose a novel change point detection approach for online learning control with full information feedback (state, disturbance, and cost feedback) for unknown time-varying dynamical systems. We show that our algorithm achieves sub-linear regret with respect to the class of Disturbance Action Control (DAC) policies, a widely studied class of policies for online control of dynamical systems, for any sub-linear number of changes and for a very general class of systems: (i) matched disturbance systems with general convex cost functions, and (ii) general systems with linear cost functions. Specifically, a (dynamic) regret of order $\Gamma_T^{1/5} T^{4/5}$ can be achieved for these classes of systems, where $\Gamma_T$ is the number of changes of the underlying system and $T$ is the duration of the control episode. That is, the change point detection approach achieves sub-linear regret for any sub-linear number of changes, which previous algorithms such as that of Minasyan et al. (2021) cannot. Numerically, we demonstrate that the change point detection approach is superior to a standard restart approach (Minasyan et al., 2021) and to standard online learning approaches for time-invariant dynamical systems. Our work presents the first regret guarantee for unknown time-varying dynamical systems in terms of a stronger notion of variability, such as the number of changes in the underlying system. The extension of our work to state and output feedback controllers is a subject of future work.
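As a short worked illustration of why this bound is sub-linear for any sub-linear number of changes, suppose the number of changes grows polynomially in the horizon. The symbol $\Gamma_T$ for the number of changes and the growth exponent $\alpha$ are notation assumed here for illustration; constants and any logarithmic factors are ignored.

$$
\Gamma_T = T^{\alpha},\ 0 \le \alpha < 1
\quad\Longrightarrow\quad
\Gamma_T^{1/5}\, T^{4/5} = T^{\alpha/5}\, T^{4/5} = T^{(4+\alpha)/5},
\qquad \frac{4+\alpha}{5} < 1 .
$$

For example, $\alpha = 1/2$ gives a regret of order $T^{9/10}$, so the average regret per step scales as $T^{-1/10}$ and vanishes as $T$ grows. Only when $\alpha = 1$, that is, when the number of changes is linear in $T$, does the bound become linear.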