MML Probabilistic Principal Component Analysis
Abstract
Principal component analysis (PCA) is perhaps the most widely used method for data dimensionality reduction. A key question in PCA is deciding how many factors to retain. This manuscript describes a new approach to automatically selecting the number of principal components based on the Bayesian minimum message length method of inductive inference. We derive a new estimate of the isotropic residual variance and demonstrate that it improves on the usual maximum likelihood approach. We also discuss extending this approach to finite mixture models of principal component analyzers.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.