The Cost of Parallelizing Boosting
Abstract
We study the cost of parallelizing weak-to-strong boosting algorithms for learning, following the recent work of Karbasi and Larsen. Our main results are two-fold:

- First, we prove a tight lower bound, showing that even "slight" parallelization of boosting requires an exponential blow-up in the complexity of training. Specifically, let γ be the weak learner's advantage over random guessing. The famous AdaBoost algorithm produces an accurate hypothesis by interacting with the weak learner for O(1/γ²) rounds, where each round runs in polynomial time. Karbasi and Larsen showed that "significant" parallelization must incur an exponential blow-up: any boosting algorithm either interacts with the weak learner for Ω(1/γ) rounds or incurs an exp(d/γ) blow-up in the complexity of training, where d is the VC dimension of the hypothesis class. We close the gap by showing that any boosting algorithm either has Ω(1/γ²) rounds of interaction or incurs a smaller exponential blow-up of exp(d).
- Complementing our lower bound, we show that there exists a boosting algorithm that uses O(1/(tγ²)) rounds and suffers a blow-up of only exp(d · t²). Plugging in t = ω(1), this shows that the smaller blow-up in our lower bound is tight. More interestingly, this provides the first trade-off between the parallelism and the total work required for boosting.
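To make the 1/γ² round count concrete, the sketch below evaluates the textbook AdaBoost training-error bound, which is not taken from this paper: if every weak hypothesis has advantage at least γ, the training error after T rounds is at most exp(-2γ²T), so it falls below 1/m (that is, zero mistakes on m examples) once T > ln(m)/(2γ²). The function name `adaboost_rounds` and the sample values of γ and m are hypothetical, chosen only for illustration.

```python
import math


def adaboost_rounds(gamma: float, m: int) -> int:
    """Smallest integer T with exp(-2 * gamma**2 * T) < 1 / m.

    Assumes (textbook setting, not this paper's construction) that every
    weak hypothesis has advantage at least gamma over random guessing.
    """
    if not 0.0 < gamma <= 0.5:
        raise ValueError("gamma must lie in (0, 0.5]")
    if m < 2:
        raise ValueError("m must be at least 2")
    # T must strictly exceed ln(m) / (2 * gamma^2).
    return math.floor(math.log(m) / (2.0 * gamma ** 2)) + 1


if __name__ == "__main__":
    m = 1000  # hypothetical training-set size
    for gamma in (0.2, 0.1, 0.05):
        print(f"gamma={gamma}: {adaboost_rounds(gamma, m)} sequential rounds")
```

Halving γ roughly quadruples the number of sequential rounds, which is the dependence that the lower bound above shows cannot be improved without an exp(d) blow-up in training complexity.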