λ-Reachability: Geometric-Horizon Safety Bellman Equations for Humanoid Safety

Abstract

We introduce λ-Reachability, a scalable approach to Hamilton-Jacobi safety analysis for high-dimensional robotic systems. Unlike prior discounted formulations that rely on fixed one-step Bellman updates, λ-Reachability employs a stochastic multi-step estimator of the safety value, using a geometrically distributed rollout horizon together with a randomly absorbed terminal. Conceptually analogous to TD(λ), λ-Reachability interpolates between local self-consistency updates and long-horizon max-over-trajectory safety targets via an interpretable horizon-control parameter. Unlike TD(λ), where the terminal value is always incorporated in learning targets, the terminal safety value in λ-Reachability is used only with a probability controlled by the parameter δ. We formally show that for δ < 1, the update induces a contraction mapping that allows temporal-difference learning; as λ → 1, the estimator recovers the undiscounted reachability objective. We apply λ-Reachability to high-dimensional safety learning problems with both simulated and real humanoid robots under balance and collision avoidance constraints. Experimental results demonstrate that λ-Reachability significantly improves both safe-set boundary classification and safety margin estimation compared to single-step temporal-difference baselines.
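
The abstract does not state the estimator's exact form, so the following is only a minimal sketch of the idea it describes: a geometrically distributed rollout horizon combined with a terminal value that is used with probability δ. It assumes a max-over-trajectory convention over a per-state safety cost g, and the names `sample_horizon`, `multi_step_target`, `margins`, and `values` are hypothetical, not taken from the paper.

```python
import random

def sample_horizon(lam, max_h, rng):
    """Sample a horizon h in {1, ..., max_h}: keep extending with probability lam."""
    h = 1
    while h < max_h and rng.random() < lam:
        h += 1
    return h

def multi_step_target(margins, values, lam, delta, rng):
    """Illustrative stochastic target for state s_0 (assumed form, not the paper's).

    margins[k] is a per-state safety cost g(s_k) along one rollout;
    values[k] is the current estimate V(s_k). Both are hypothetical inputs
    of equal length, with at least two entries.
    """
    max_h = len(margins) - 1      # keep the last state available as a terminal
    h = sample_horizon(lam, max_h, rng)
    target = max(margins[:h])     # max over the first h visited states
    if rng.random() < delta:      # terminal value used only with probability delta
        target = max(target, values[h])
    return target
```

Under these assumptions, `lam = 0` always yields a one-step target of the form max(g(s_0), V(s_1)) whenever the terminal value is used, while larger `lam` values rely more on the observed maximum along the rollout and less on the bootstrapped estimate.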
