Optimal Principal Component Analysis in Distributed and Streaming Models

Abstract

We study the Principal Component Analysis (PCA) problem in the distributed and streaming models of computation. Given a matrix $A \in \mathbb{R}^{m \times n}$, a rank parameter $k < \mathrm{rank}(A)$, and an accuracy parameter $0 < \varepsilon < 1$, we want to output an $m \times k$ orthonormal matrix $U$ for which $\|A - U U^T A\|_F^2 \le (1 + \varepsilon) \cdot \|A - A_k\|_F^2$, where $A_k \in \mathbb{R}^{m \times n}$ is the best rank-$k$ approximation to $A$. This paper provides improved algorithms for distributed PCA and streaming PCA.
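
To make the approximation guarantee concrete, the following is a minimal sketch of how one might check it numerically for a candidate U. It assumes NumPy and a small dense matrix held on one machine; it is not the paper's distributed or streaming algorithm. The function names (`best_rank_k_error`, `projection_error`, `satisfies_guarantee`) and the random test matrix are hypothetical, introduced only for illustration.

```python
import numpy as np


def best_rank_k_error(A, k):
    """Squared Frobenius error of the best rank-k approximation A_k.

    By the Eckart-Young theorem this equals the sum of the squared
    singular values of A beyond the k-th.
    """
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(s[k:] ** 2))


def projection_error(A, U):
    """Squared Frobenius error ||A - U U^T A||_F^2 for a given m x k matrix U."""
    residual = A - U @ (U.T @ A)
    return float(np.linalg.norm(residual, "fro") ** 2)


def satisfies_guarantee(A, U, k, eps):
    """Check ||A - U U^T A||_F^2 <= (1 + eps) * ||A - A_k||_F^2."""
    return projection_error(A, U) <= (1.0 + eps) * best_rank_k_error(A, k)


# Illustrative data (an assumption, not taken from the paper).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 30))
k = 5

# The top-k left singular vectors give the optimal U, so the guarantee
# holds for any eps > 0.
U_full, _, _ = np.linalg.svd(A, full_matrices=False)
U_opt = U_full[:, :k]

holds = satisfies_guarantee(A, U_opt, k, eps=0.1)
```

Here the exact SVD serves only as a reference point: it attains the optimal error, whereas the problem statement asks for any U with orthonormal columns that comes within a factor of (1 + ε) of that optimum.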
