Learning over Forward-Invariant Policy Classes: Reinforcement Learning without Safety Concerns

Abstract

This paper proposes a safe reinforcement learning (RL) framework based on forward-invariance-induced action-space design. The control problem is cast as a Markov decision process, but instead of relying on runtime shielding or penalty-based constraints, safety is embedded directly into the action representation. Specifically, we construct a finite admissible action set in which each discrete action corresponds to a stabilizing feedback law that preserves forward invariance of a prescribed safe state set. Consequently, the RL agent optimizes policies over a safe-by-construction policy class. We validate the framework on a quadcopter hover-regulation problem under disturbance. Simulation results show that the learned policy improves closed-loop performance and switching efficiency, while all evaluated policies remain safety-preserving. The proposed formulation decouples safety assurance from performance optimization and provides a promising foundation for safe learning in nonlinear systems.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…