PlatoLTL: Learning to Generalize Across Symbols in LTL Instructions for Multi-Task RL

Abstract

A central challenge in multi-task reinforcement learning (RL) is to train generalist policies capable of performing tasks not seen during training. To facilitate such generalization, linear temporal logic (LTL) has emerged as a powerful formalism for specifying structured, temporally extended tasks to RL agents. While existing approaches to LTL-guided multi-task RL demonstrate generalization across LTL specifications, they are unable to generalize to unseen vocabularies of propositions (or "symbols"), which describe high-level events in LTL. We present PlatoLTL, a novel approach that enables policies to zero-shot generalize not only compositionally across LTL structures, but also parametrically across propositions. We model propositions as parameterized instances of atomic predicates, allowing policies to learn shared structure across related propositions. We propose a novel architecture that embeds and composes parameterized propositions to represent LTL formulae, and demonstrate zero-shot generalization in a range of challenging environments.
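To make the idea of parameterized propositions concrete, the following is a minimal sketch and not the authors' architecture. It assumes a proposition can be encoded as a predicate name plus a tuple of numeric parameters. The names `Proposition`, `PropositionEmbedder`, `embed_formula`, and `op_vecs`, the affine embedding, and the mean-based composition are all hypothetical choices made for illustration. The weights are random stand-ins for parameters that would be learned in practice.

```python
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class Proposition:
    """A proposition as a predicate name plus numeric parameters (hypothetical encoding)."""
    predicate: str
    params: tuple


class PropositionEmbedder:
    """Toy embedder: one base vector and one weight matrix per predicate.

    Weights are random here; in a real system they would be learned.
    """

    def __init__(self, predicates, param_dim, embed_dim, seed=0):
        rng = random.Random(seed)
        self.param_dim = param_dim
        self.embed_dim = embed_dim
        self.base = {p: [rng.gauss(0, 1) for _ in range(embed_dim)] for p in predicates}
        self.weights = {
            p: [[rng.gauss(0, 1) for _ in range(param_dim)] for _ in range(embed_dim)]
            for p in predicates
        }

    def embed(self, prop):
        """Affine map of the parameters, shared by all propositions of one predicate."""
        if len(prop.params) != self.param_dim:
            raise ValueError("unexpected number of parameters")
        base = self.base[prop.predicate]  # raises KeyError for an unknown predicate
        rows = self.weights[prop.predicate]
        return [b + sum(w * x for w, x in zip(row, prop.params)) for b, row in zip(base, rows)]


def embed_formula(formula, embedder, op_vecs):
    """Recursively embed a formula given as a Proposition or (operator, *subformulae)."""
    if isinstance(formula, Proposition):
        return embedder.embed(formula)
    op, *children = formula
    child_vecs = [embed_formula(c, embedder, op_vecs) for c in children]
    # Average the children element-wise, then add the operator's vector.
    mean = [sum(vals) / len(child_vecs) for vals in zip(*child_vecs)]
    return [o + m for o, m in zip(op_vecs[op], mean)]


embedder = PropositionEmbedder(["reach", "avoid"], param_dim=2, embed_dim=4)
op_vecs = {
    "eventually": [1.0, 0.0, 0.0, 0.0],
    "always": [0.0, 1.0, 0.0, 0.0],
    "and": [0.0, 0.0, 1.0, 0.0],
}

# "Eventually reach (0.9, 0.1) and always avoid (0.5, 0.5)".
# The parameter values need not match any proposition seen earlier.
task = (
    "and",
    ("eventually", Proposition("reach", (0.9, 0.1))),
    ("always", Proposition("avoid", (0.5, 0.5))),
)
vector = embed_formula(task, embedder, op_vecs)
```

Because every `reach` proposition passes through the same base vector and weight matrix, a proposition with new parameter values still receives an embedding, which is the kind of shared structure the abstract describes. The mean-based composition is order-insensitive, so it could not distinguish the two arguments of an operator such as "until". A real model would need a composition that preserves argument order.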
