Projection-Free Algorithms in Statistical Estimation
Abstract
The Frank-Wolfe (FW) algorithm and its variants have attracted a surge of interest in the machine learning community because they are projection-free. Recent work has reduced the gradient evaluation complexity of FW algorithms to $\log(\frac{1}{\epsilon})$ for smooth and strongly convex objectives. This complexity result is especially significant in learning problems, where the overwhelming data size makes a single gradient evaluation computationally expensive. However, in high-dimensional statistical estimation problems, the objective is typically not strongly convex, and is sometimes even non-convex. In this paper, we extend state-of-the-art FW-type algorithms to large-scale, high-dimensional estimation problems. We show that as long as the objective satisfies restricted strong convexity, and we are not optimizing beyond the statistical limit of the model, the $\log(\frac{1}{\epsilon})$ gradient evaluation complexity can still be attained.