Low-Rank Approximation with $1/\epsilon^{1/3}$ Matrix-Vector Products
Abstract
We study iterative methods based on Krylov subspaces for low-rank approximation under any Schatten-$p$ norm. Here, given access to a matrix $A$ through matrix-vector products, an accuracy parameter $\epsilon$, and a target rank $k$, the goal is to find a rank-$k$ matrix $Z$ with orthonormal columns such that $\| A(I - ZZ^\top)\|_{S_p} \leq (1+\epsilon) \min_{U^\top U = I_k} \|A(I - UU^\top)\|_{S_p}$, where $\|M\|_{S_p}$ denotes the $\ell_p$ norm of the singular values of $M$. For the special cases of $p = 2$ (Frobenius norm) and $p = \infty$ (spectral norm), Musco and Musco (NeurIPS 2015) obtained an algorithm based on Krylov methods that uses $\tilde{O}(k/\sqrt{\epsilon})$ matrix-vector products, improving on the naïve $\tilde{O}(k/\epsilon)$ dependence obtainable by the power method, where $\tilde{O}$ suppresses $\mathrm{poly}(\log(dk/\epsilon))$ factors. Our main result is an algorithm that uses only $\tilde{O}(kp^{1/6}/\epsilon^{1/3})$ matrix-vector products, and works for all $p \geq 1$. For $p = 2$ our bound improves the previous $\tilde{O}(k/\epsilon^{1/2})$ bound to $\tilde{O}(k/\epsilon^{1/3})$. Since the Schatten-$p$ and Schatten-$\infty$ norms are the same up to a $(1+\epsilon)$-factor when $p \geq (\log d)/\epsilon$, our bound recovers the result of Musco and Musco for $p = \infty$. Further, we prove a matrix-vector query lower bound of $\Omega(1/\epsilon^{1/3})$ for any fixed constant $p \geq 1$, showing that, surprisingly, $\tilde{\Theta}(1/\epsilon^{1/3})$ is the optimal complexity for constant $k$. To obtain our results, we introduce several new techniques, including optimizing over multiple Krylov subspaces simultaneously, and pinching inequalities for partitioned operators. Our lower bound for $p \in [1,2]$ uses the Araki-Lieb-Thirring trace inequality, whereas for $p > 2$, we appeal to a norm-compression inequality for aligned partitioned operators.
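To make the objective concrete, here is a minimal sketch in Python with NumPy (the language, the function names, and the parameter `q` are all assumptions for illustration; the abstract contains no code). It evaluates the Schatten-$p$ error of a candidate $Z$ and builds $Z$ with a plain single-subspace block Krylov baseline in the spirit of the Musco and Musco approach. It is not the algorithm of this paper, which optimizes over multiple Krylov subspaces simultaneously.

```python
import numpy as np

def _lp(values, p):
    """l_p norm of a nonnegative vector; p may be np.inf."""
    if np.isinf(p):
        return float(values.max())
    return float((values ** p).sum() ** (1.0 / p))

def schatten_norm(M, p):
    """Schatten-p norm: the l_p norm of the singular values of M."""
    return _lp(np.linalg.svd(M, compute_uv=False), p)

def optimal_error(A, k, p):
    """min over U with U^T U = I_k of ||A(I - U U^T)||_{S_p}.

    The minimum is the l_p norm of the singular values of A beyond
    the k largest (truncated SVD is optimal for these norms).
    """
    s = np.linalg.svd(A, compute_uv=False)
    return _lp(s[k:], p)

def block_krylov_basis(A, k, q, rng):
    """Hypothetical baseline: rank-k basis from one block Krylov subspace.

    Spans [G, (A^T A) G, ..., (A^T A)^q G] for a random d x k block G.
    Each loop iteration costs k products with A and k with A^T.
    Assumes k * (q + 1) <= d so the stacked basis has full column count.
    """
    d = A.shape[1]
    block = np.linalg.qr(rng.standard_normal((d, k)))[0]
    blocks = [block]
    for _ in range(q):
        # Re-orthonormalize each block for numerical stability; the
        # column span of the block is unchanged by the QR step.
        block = np.linalg.qr(A.T @ (A @ block))[0]
        blocks.append(block)
    Q = np.linalg.qr(np.hstack(blocks))[0]  # d x k(q+1), orthonormal columns
    # Rayleigh-Ritz step: top-k right singular vectors of A restricted to span(Q).
    _, _, Vt = np.linalg.svd(A @ Q, full_matrices=False)
    return Q @ Vt[:k].T  # d x k, orthonormal columns
```

Given a matrix `A` and a basis `Z = block_krylov_basis(A, k, q, rng)`, the ratio `schatten_norm(A - (A @ Z) @ Z.T, p) / optimal_error(A, k, p)` plays the role of $1+\epsilon$ in the guarantee above. Computing `optimal_error` requires a full SVD, so it serves only as a reference on small examples.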