Dual Averaging on Compactly-Supported Distributions And Application to No-Regret Learning on a Continuum
Abstract
We consider an online learning problem on a continuum. A decision maker is given a compact feasible set $S$, and is faced with the following sequential problem: at iteration $t$, the decision maker chooses a distribution $x^{(t)} \in \Delta(S)$, then a loss function $\ell^{(t)} : S \to \mathbb{R}_+$ is revealed, and the decision maker incurs expected loss $\langle \ell^{(t)}, x^{(t)} \rangle = \mathbb{E}_{s \sim x^{(t)}}\, \ell^{(t)}(s)$. We view the problem as an online convex optimization problem on the space $\Delta(S)$ of Lebesgue-continuous distributions on $S$. We prove a general regret bound for the Dual Averaging method on $L^2(S)$, then prove that dual averaging with $\omega$-potentials (a class of strongly convex regularizers) achieves sublinear regret when $S$ is uniformly fat (a condition weaker than convexity).
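To make the sequential protocol concrete, the following is a minimal sketch under simplifying assumptions that are not taken from the paper: the set S = [0, 1] is replaced by a finite grid, the regularizer is the negative entropy (which yields the familiar exponential-weights update), and the learning rate `eta` is held constant. The function name `dual_averaging_on_grid` and its parameters are hypothetical.

```python
import numpy as np


def dual_averaging_on_grid(losses, eta):
    """Dual averaging with a negative-entropy regularizer on a discretized S.

    losses: array of shape (T, N); losses[t, i] is the loss at grid point i
            revealed at iteration t (assumed to lie in [0, 1]).
    eta:    positive learning rate, constant over time (an assumption).

    Returns the last distribution played and the regret against the best
    fixed grid point in hindsight.
    """
    T, N = losses.shape
    cumulative = np.zeros(N)       # sum of the losses revealed so far
    expected = np.empty(T)         # expected loss incurred at each iteration
    x = np.full(N, 1.0 / N)
    for t in range(T):
        # Minimizing <cumulative, x> + (1/eta) * sum(x * log x) over the
        # simplex gives x proportional to exp(-eta * cumulative).
        logits = -eta * cumulative
        logits -= logits.max()     # shift for numerical stability
        weights = np.exp(logits)
        x = weights / weights.sum()
        expected[t] = x @ losses[t]  # discrete analogue of <l(t), x(t)>
        cumulative += losses[t]      # the loss is revealed after x is chosen
    regret = expected.sum() - cumulative.min()
    return x, regret
```

For losses in [0, 1], the standard exponential-weights analysis bounds this discrete regret by ln(N)/eta + eta*T/8, which is sublinear in T for eta proportional to 1/sqrt(T). The continuum setting of the abstract is more delicate, since the comparator is no longer one of finitely many points; this sketch does not address that.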