Adaptive Online Learning with Varying Norms
Abstract
Given any increasing sequence of norms $\|\cdot\|_0,\dots,\|\cdot\|_{T-1}$, we provide an online convex optimization algorithm that outputs points $w_t$ in some domain $W$ in response to convex losses $\ell_t:W\to\mathbb{R}$ that guarantees regret $R_T(u)=\sum_{t=1}^T \ell_t(w_t)-\ell_t(u)\le \tilde O\left(\|u\|_{T-1}\sqrt{\sum_{t=1}^T \|g_t\|_{t-1,\star}^2}\right)$, where $g_t$ is a subgradient of $\ell_t$ at $w_t$. Our method does not require tuning to the value of $u$ and allows for arbitrary convex $W$. We apply this result to obtain new "full-matrix"-style regret bounds. Along the way, we provide a new examination of the full-matrix AdaGrad algorithm, suggesting a better learning rate value that improves significantly upon prior analysis. We use our new techniques to tune AdaGrad on-the-fly, realizing our improved bound in a concrete algorithm.