Tree-Projected Gradient Descent for Estimating Gradient-Sparse Parameters on Graphs
Abstract
We study estimation of a gradient-sparse parameter vector θ* ∈ Rp, having strong gradient-sparsity s*:=\|∇G θ*\|0 on an underlying graph G. Given observations Z1,…,Zn and a smooth, convex loss function L for which θ* minimizes the population risk E[L(θ;Z1,…,Zn)], we propose to estimate θ* by a projected gradient descent algorithm that iteratively and approximately projects gradient steps onto spaces of vectors having small gradient-sparsity over low-degree spanning trees of G. We show that, under suitable restricted strong convexity and smoothness assumptions for the loss, the resulting estimator achieves the squared-error risk s*n (1+ps*) up to a multiplicative constant that is independent of G. In contrast, previous polynomial-time algorithms have only been shown to achieve this guarantee in more specialized settings, or under additional assumptions for G and/or the sparsity pattern of ∇G θ*. As applications of our general framework, we apply our results to the examples of linear models and generalized linear models with random design.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.