Optimal Principal Component Analysis in Distributed and Streaming Models
Abstract
We study the Principal Component Analysis (PCA) problem in the distributed and streaming models of computation. Given a matrix $A \in \mathbb{R}^{m \times n}$, a rank parameter $k < \mathrm{rank}(A)$, and an accuracy parameter $0 < \varepsilon < 1$, we want to output an $m \times k$ orthonormal matrix $U$ for which $\|A - U U^T A\|_F^2 \le (1 + \varepsilon) \cdot \|A - A_k\|_F^2$, where $A_k \in \mathbb{R}^{m \times n}$ is the best rank-$k$ approximation to $A$. This paper provides improved algorithms for distributed PCA and streaming PCA.
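To make the objective concrete, the following is a minimal NumPy sketch that evaluates the approximation guarantee on a small random matrix. It is not the paper's distributed or streaming algorithm; it only checks the inequality centrally using a full SVD. The function names (`pca_error`, `best_rank_k_error`, `satisfies_guarantee`) and the test matrix are assumptions made for illustration.

```python
import numpy as np


def pca_error(A, U):
    """Squared Frobenius error ||A - U U^T A||_F^2 for an m x k orthonormal U."""
    residual = A - U @ (U.T @ A)
    return np.linalg.norm(residual, "fro") ** 2


def best_rank_k_error(A, k):
    """||A - A_k||_F^2, i.e. the sum of squared singular values beyond the k-th."""
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(s[k:] ** 2))


def satisfies_guarantee(A, U, k, eps):
    """Check ||A - U U^T A||_F^2 <= (1 + eps) * ||A - A_k||_F^2."""
    return pca_error(A, U) <= (1 + eps) * best_rank_k_error(A, k)


# Illustrative data (hypothetical): a 50 x 30 Gaussian matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 30))
k, eps = 5, 0.1

# The top-k left singular vectors attain the optimum, so they satisfy
# the guarantee for any eps > 0.
U_full, _, _ = np.linalg.svd(A, full_matrices=False)
U_opt = U_full[:, :k]
print(satisfies_guarantee(A, U_opt, k, eps))
```

The exact SVD solution is the baseline that a $(1 + \varepsilon)$-approximate $U$ is measured against; any candidate $U$ with orthonormal columns can be passed to `satisfies_guarantee` in the same way.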