On Approximating Functions of the Singular Values in a Stream
Abstract
For any real number $p > 0$, we nearly completely characterize the space complexity of estimating $\|A\|_p^p = \sum_{i=1}^n \sigma_i^p$ for $n \times n$ matrices $A$ in which each row and each column has $O(1)$ non-zero entries and whose entries are presented one at a time in a data stream model. Here the $\sigma_i$ are the singular values of $A$, and when $p \geq 1$, $\|A\|_p^p$ is the $p$-th power of the Schatten $p$-norm. We show that when $p$ is not an even integer, to obtain a $(1+\epsilon)$-approximation to $\|A\|_p^p$ with constant probability, any 1-pass algorithm requires $n^{1-g(\epsilon)}$ bits of space, where $g(\epsilon) \to 0$ as $\epsilon \to 0$ and $\epsilon > 0$ is a constant independent of $n$. However, when $p$ is an even integer, we give an upper bound of $n^{1-2/p}\,\mathrm{poly}(\epsilon^{-1} \log n)$ bits of space, which holds even in the turnstile data stream model. The latter is optimal up to $\mathrm{poly}(\epsilon^{-1} \log n)$ factors. Our results considerably strengthen lower bounds in previous work for arbitrary (not necessarily sparse) matrices $A$: the previous best lower bound was $\Omega(\log n)$ for $p \in (0,1)$, $\Omega(n^{1/p-1/2}/\log n)$ for $p \in [1,2)$, and $\Omega(n^{1-2/p})$ for $p \in (2,\infty)$. We note that for $p \in (2, \infty)$, while our lower bound for even integers is the same, for other $p$ in this range our lower bound is $n^{1-g(\epsilon)}$, which is considerably stronger than the previous $n^{1-2/p}$ for small enough constant $\epsilon > 0$. We obtain similar near-linear lower bounds for Ky-Fan norms, SVD entropy, eigenvalue shrinkers, and M-estimators, many of which could have been solvable in logarithmic space prior to our work.
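To make the estimated quantity concrete, the following is a minimal offline sketch in Python with NumPy. It is not the paper's streaming algorithm and uses memory for the whole matrix; it only evaluates $\sum_i \sigma_i^p$ exactly. The function names `schatten_p_power` and `schatten_even_power_via_trace` are hypothetical, chosen for illustration. The second function relies on the standard identity that, for an even integer $p$, $\sum_i \sigma_i^p = \mathrm{tr}\big((A^\top A)^{p/2}\big)$, which expresses the quantity without computing any singular values.

```python
import numpy as np


def schatten_p_power(A, p):
    """Return sum_i sigma_i^p, where sigma_i are the singular values of A.

    Exact offline computation via a full SVD (illustrative only).
    """
    if p <= 0:
        raise ValueError("p must be positive")
    sigma = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(sigma ** p))


def schatten_even_power_via_trace(A, p):
    """For an even integer p, return tr((A^T A)^(p/2)) = sum_i sigma_i^p."""
    if not isinstance(p, int) or p <= 0 or p % 2 != 0:
        raise ValueError("p must be a positive even integer")
    gram = A.T @ A
    return float(np.trace(np.linalg.matrix_power(gram, p // 2)))


def sparse_example(n):
    """Build an n x n matrix with at most 2 non-zeros per row and column."""
    A = 2.0 * np.eye(n)
    for i in range(n - 1):
        A[i, i + 1] = 1.0  # superdiagonal entry
    return A


if __name__ == "__main__":
    A = sparse_example(6)
    for p in (2, 4, 6):
        print(p, schatten_p_power(A, p), schatten_even_power_via_trace(A, p))
    # Non-even p has no such trace formula in this sketch; use the SVD route.
    print(1.5, schatten_p_power(A, 1.5))
```

For even $p$ the two functions should agree up to floating-point error; for any other $p$ only the SVD-based function applies here.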