Sample-Optimal Zero-Violation Safety For Continuous Control
Abstract
In this paper, we study the problem of ensuring safety with a few shots of samples for partially unknown systems. We first characterize a fundamental limit when producing safe actions is not possible due to insufficient information or samples. Then, we develop a technique that can generate provably safe actions and recovery behaviors using a minimum number of samples. In the performance analysis, we also establish Nagumos theorem - like results with relaxed assumptions, which is potentially useful in other contexts. Finally, we discuss how the proposed method can be integrated into a policy gradient algorithm to assure safety and stability with a handful of samples without stabilizing initial policies or generative models to probe safe actions.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.