Soft Switching Expert Policies for Controlling Systems with Uncertain Parameters

Abstract

This paper proposes a simulation-based reinforcement learning algorithm for controlling systems with uncertain and varying system parameters. While simulators are useful for safely learning control policies, the reality gap remains a major challenge. To alleviate this challenge, we propose a two-stage algorithm. First, multiple control policies are learned for systems with different system parameters in a simulator. Second, for a real system, the control policies are adaptively switched using an online convex optimization algorithm based on observations. This approach is expected to reduce learning complexity compared with existing approaches that rely on a single policy to address the reality gap.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…