Learning Constraints from Stochastic Partially-Observed Closed-Loop Demonstrations
Abstract
We present a method for learning unknown parametric constraints from locally-optimal input-output trajectory data. We assume the data are generated by rollouts of stochastic nonlinear dynamics under a single state or output feedback law and a single initial condition, but with distinct noise realizations, where the feedback law robustly satisfies the underlying constraints despite worst-case noise outcomes. We encode the Karush-Kuhn-Tucker (KKT) conditions of this robust optimal feedback control problem within a feasibility problem to recover constraints consistent with the local optimality of the demonstrations. We prove that our constraint learning method (i) accurately recovers the demonstrator's policy, and (ii) conservatively estimates the set of policies that ensure constraint satisfaction despite worst-case noise realizations. Moreover, we perform a sensitivity analysis, proving that when demonstrations are corrupted by transmission error, the inaccuracy in the learned feedback law scales linearly in the error magnitude. Empirically, our method accurately recovers unknown constraints from simulated noisy, closed-loop demonstrations generated using both linear and nonlinear dynamics (e.g., unicycle and quadrotor) and a range of feedback mechanisms.
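To make the idea of recovering a constraint from the KKT conditions of a demonstration concrete, the following is a minimal sketch on a toy problem, not the method of the paper. It assumes a deterministic, one-dimensional demonstrator that solves min (x - x_target)^2 subject to x <= theta, with theta unknown to the learner; there is no noise, feedback law, or robustness here. The function name recover_upper_bound and the parameters x_demo, x_target, and tol are hypothetical and chosen only for illustration.

```python
import math


def recover_upper_bound(x_demo, x_target, tol=1e-9):
    """Toy inverse-KKT sketch (illustrative assumption, not the paper's algorithm).

    Assumed demonstrator problem: min (x - x_target)^2  s.t.  x <= theta.
    Returns an interval (lo, hi) of theta values consistent with the
    KKT conditions holding at the observed demonstration x_demo.
    """
    # Stationarity: 2 * (x_demo - x_target) + lam = 0, solved for the multiplier.
    lam = -2.0 * (x_demo - x_target)

    # Dual feasibility: the multiplier of an inequality constraint must be >= 0.
    if lam < -tol:
        raise ValueError("demonstration is inconsistent with the assumed problem")

    if lam > tol:
        # Complementary slackness: lam * (x_demo - theta) = 0 with lam > 0
        # forces the constraint to be active, so theta = x_demo exactly.
        return (x_demo, x_demo)

    # lam == 0: the constraint may be inactive, so only primal feasibility
    # (x_demo <= theta) restricts theta.
    return (x_demo, math.inf)


if __name__ == "__main__":
    # Demonstrator stopped short of its target: the bound is pinned down.
    print(recover_upper_bound(1.0, 3.0))
    # Demonstrator reached its target: only a lower limit on theta is learned.
    print(recover_upper_bound(3.0, 3.0))
```

In this toy setting, a demonstration that stops short of its unconstrained optimum identifies the constraint exactly, while one that reaches the optimum only bounds it from one side. The abstract's feasibility problem plays the analogous role for the robust feedback control setting.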