PCA by Optimisation of Symmetric Functions has no Spurious Local Optima

Abstract

Principal Component Analysis (PCA) finds the best linear representation of data, and is an indispensable tool in many learning and inference tasks. Classically, principal components of a dataset are interpreted as the directions that preserve most of its "energy", an interpretation that is theoretically underpinned by the celebrated Eckart-Young-Mirsky Theorem. This paper introduces many other ways of performing PCA, with various geometric interpretations, and proves that the corresponding family of non-convex programs have no spurious local optima, while possessing only strict saddle points. These programs therefore loosely behave like convex problems and can be efficiently solved to global optimality, for example, with certain variants of the stochastic gradient descent. Beyond providing new geometric interpretations and enhancing our theoretical understanding of PCA, our findings might pave the way for entirely new approaches to structured dimensionality reduction, such as sparse PCA and nonnegative matrix factorisation. More specifically, we study an unconstrained formulation of PCA using determinant optimisation that might provide an elegant alternative to the deflating scheme commonly used in sparse PCA.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…