No-Regret Algorithms for Safe Bayesian Optimization with Monotonicity Constraints

Abstract

We consider the problem of sequentially maximizing an unknown function f over a set of actions of the form (s,x), where the selected actions must satisfy a safety constraint with respect to an unknown safety function g. We model f and g as lying in a reproducing kernel Hilbert space (RKHS), which facilitates the use of Gaussian process methods. While existing works for this setting have provided algorithms that are guaranteed to identify a near-optimal safe action, the problem of attaining low cumulative regret has remained largely unexplored, with a key challenge being that expanding the safe region can incur high regret. To address this challenge, we show that if g is monotone with respect to just the single variable s (with no such constraint on f), sublinear regret becomes achievable with our proposed algorithm. In addition, we show that a modified version of our algorithm is able to attain sublinear regret (for suitably defined notions of regret) for the task of finding a near-optimal s corresponding to every x, as opposed to only finding the global safe optimum. Our findings are supported with empirical evaluations on various objective and safety functions.
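
To illustrate why monotonicity of g in s can help with safe-region expansion, here is a minimal sketch, not the paper's algorithm. It rests on assumptions the abstract does not state: a finite grid of actions (s_i, x_j) with s sorted in ascending order, a safety condition of the form g(s, x) <= h, and g non-decreasing in s. Under these assumptions, certifying one action (s_i, x_j) as safe also certifies every (s_k, x_j) with s_k <= s_i, with no further evaluations. The function name `propagate_safe_set` is hypothetical.

```python
import numpy as np

def propagate_safe_set(certified_safe: np.ndarray) -> np.ndarray:
    """Extend a certified-safe mask using monotonicity in s.

    certified_safe[i, j] is True when action (s_i, x_j) is already known
    to be safe. Rows index s in ascending order, columns index x.

    ASSUMPTION (not from the abstract): safety means g(s, x) <= h and g is
    non-decreasing in s, so safety at s_i implies safety at every s_k <= s_i
    for the same x.
    """
    # Flip so the largest s comes first, take a running OR down the s axis,
    # then flip back: a row becomes safe if any row at or above it is safe.
    flipped = certified_safe[::-1]
    return np.logical_or.accumulate(flipped, axis=0)[::-1]

# Usage: 4 values of s, 3 values of x, two certified actions.
certified = np.zeros((4, 3), dtype=bool)
certified[2, 0] = True  # (s_2, x_0) certified safe
certified[0, 1] = True  # (s_0, x_1) certified safe
expanded = propagate_safe_set(certified)
```

In this sketch the certified action (s_2, x_0) makes s_0 and s_1 safe for x_0 as well, while the column for x_2 stays empty because nothing in it has been certified.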
