Nearly Instance Optimal Sparse Matrix Approximation from Matrix-Vector Products
Abstract
A large body of work studies the problem of learning an approximation to a matrix $A \in \mathbb{R}^{m \times n}$ that is accessible only implicitly via matrix-vector product queries (matvec queries) of the form $x \mapsto Ax$ or $x \mapsto A^T x$. Of particular interest are methods that learn a near-optimal approximation with a fixed sparsity pattern. For example, we might want to learn a near-optimal diagonal, banded, or arrow-head approximation to an implicit matrix $A$. Naturally, the number of matvec queries required to solve this problem depends on the sparsity pattern, which can be encoded as a binary matrix $S \in \{0,1\}^{m \times n}$. The query complexity of previous algorithms scales with quantities like the total number of ones in $S$, its maximum column/row sparsity, or the chromatic number of its "conflict graph". These quantities are incomparable: for a given $S$, parameterizing by one might yield lower query complexity than parameterizing by another. In this work, we unify and tighten these prior results by providing a nearly sharp characterization of the matvec query complexity of sparse matrix approximation. Generalizing a definition from graph algorithms, let the degeneracy, $\mathrm{degen}(S)$, denote the smallest number $k$ such that, if we iteratively delete all rows and columns of $S$ with $\le k$ ones, we are left with an empty matrix. We show that, for any sparsity pattern $S$, a near-optimal approximation to $A$ with sparsity pattern $S$ can be learned with $O(\mathrm{degen}(S))$ matrix-vector product queries, and that $\Omega(\mathrm{degen}(S))$ queries are necessary. Moreover, unlike prior work based on graph coloring, all of our methods run in polynomial time.
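To make the degeneracy definition concrete, the following is a minimal brute-force sketch that follows the wording of the abstract literally. It is not the paper's algorithm and issues no matvec queries; it only evaluates the combinatorial quantity on a given pattern. The function names `peels_to_empty` and `degeneracy` are hypothetical, and the sketch assumes that ones are counted within the rows and columns that have not yet been deleted.

```python
def peels_to_empty(S, k):
    """Return True if repeatedly deleting every remaining row and column
    with at most k ones (counted among remaining entries) empties S.

    Assumption: S is a list of equal-length lists of 0/1 values.
    """
    m = len(S)
    n = len(S[0]) if m else 0
    rows, cols = set(range(m)), set(range(n))
    while rows or cols:
        # Rows and columns that currently hold at most k ones.
        light_rows = {i for i in rows if sum(S[i][j] for j in cols) <= k}
        light_cols = {j for j in cols if sum(S[i][j] for i in rows) <= k}
        if not light_rows and not light_cols:
            return False  # peeling is stuck with a non-empty remainder
        rows -= light_rows
        cols -= light_cols
    return True


def degeneracy(S):
    """Smallest k for which the peeling process empties S (brute force)."""
    m = len(S)
    n = len(S[0]) if m else 0
    # k = max(m, n) always succeeds, since no row or column has more ones.
    for k in range(max(m, n) + 1):
        if peels_to_empty(S, k):
            return k


n = 5
diagonal = [[int(i == j) for j in range(n)] for i in range(n)]
arrowhead = [[int(i == 0 or j == 0 or i == j) for j in range(n)] for i in range(n)]
dense = [[1] * n for _ in range(n)]

print(degeneracy(diagonal))   # 1
print(degeneracy(arrowhead))  # 2
print(degeneracy(dense))      # 5
```

Under this reading, the 5 × 5 arrow-head pattern has 13 ones and a row with 5 ones, yet its degeneracy is 2, which illustrates how degeneracy can be much smaller than the total number of ones or the maximum row/column sparsity.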