Gray-Box Optimization using Optimism in the Face of Uncertainty

Abstract

This paper considers sequential gray-box optimization, where the objective function is given as the composition of a loss function and a parametric model. Crucially, the parameters of the model are unknown and must be estimated iteratively from noisy observations of the model outputs. This problem setup generalizes the parametric black-box optimization problem known as the (contextual) stochastic linear bandit. To address the sequential gray-box optimization problem, we propose a structure-exploiting method that leverages known problem structure, given in terms of the loss function and an a priori set of admissible parameters. The method is based on the principle of optimism in the face of uncertainty and trades off exploration and exploitation by minimizing a lower confidence bound on the true objective function. We provide a detailed regret analysis of the novel method, which improves on state-of-the-art results for the special case of stochastic linear bandits by using a recently published bound for the parameter confidence sets arising in multi-output linear least-squares estimation. Numerical examples illustrate the superior performance of structure-exploiting methods compared to structure-agnostic approaches.
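To make the idea of minimizing a lower confidence bound concrete, the following is a minimal sketch under simplifying assumptions, not the method analyzed in the paper. It assumes a scalar-output linear model, a finite set of candidate actions, a squared-error loss toward a known target, and a fixed confidence radius `beta`; the paper's multi-output confidence bound is not reproduced here. The names `lower_confidence_bound`, `ofu_gray_box`, `beta`, `lam`, and `target` are hypothetical.

```python
import numpy as np


def lower_confidence_bound(mean, width, target):
    """Smallest squared loss (y - target)^2 over y in [mean - width, mean + width]."""
    # If the target lies inside the interval, the optimistic loss is zero.
    gap = np.maximum(np.abs(mean - target) - width, 0.0)
    return gap ** 2


def ofu_gray_box(actions, theta_true, target, rounds=200, lam=1.0,
                 beta=2.0, noise_std=0.1, seed=0):
    """Optimistic action selection for loss(model(x, theta)) with unknown theta.

    Assumptions (illustrative only): model(x, theta) = x @ theta, a fixed
    confidence radius beta, and Gaussian observation noise.
    """
    rng = np.random.default_rng(seed)
    d = actions.shape[1]
    V = lam * np.eye(d)   # regularized Gram matrix
    b = np.zeros(d)       # accumulated x * y
    chosen = []
    for _ in range(rounds):
        theta_hat = np.linalg.solve(V, b)   # ridge least-squares estimate
        V_inv = np.linalg.inv(V)
        mean = actions @ theta_hat
        # Half-width of the prediction interval implied by the ellipsoid
        # {theta : (theta - theta_hat)^T V (theta - theta_hat) <= beta^2}.
        width = beta * np.sqrt(np.einsum("ij,jk,ik->i", actions, V_inv, actions))
        lcb = lower_confidence_bound(mean, width, target)
        idx = int(np.argmin(lcb))           # optimistic choice
        x = actions[idx]
        y = x @ theta_true + noise_std * rng.standard_normal()  # noisy model output
        V += np.outer(x, x)
        b += y * x
        chosen.append(idx)
    return np.array(chosen), np.linalg.solve(V, b)


if __name__ == "__main__":
    actions = np.eye(3)
    chosen, theta_hat = ofu_gray_box(actions, np.array([0.0, 1.0, 2.0]), target=2.0)
    print("selection counts:", np.bincount(chosen, minlength=3))
    print("estimate:", theta_hat)
```

The sketch exploits the known loss directly: rather than bounding the objective as a black box, it propagates the parameter confidence set through the model and takes the most favorable loss value consistent with it. For a scalar output and a squared loss this optimistic value has the closed form used in `lower_confidence_bound`; more general losses or multi-output models would typically require solving an optimization problem over the confidence set.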
