Efficient L1-Norm Principal-Component Analysis via Bit Flipping

Abstract

It was shown recently that the K L1-norm principal components (L1-PCs) of a real-valued data matrix $\mathbf{X} \in \mathbb{R}^{D \times N}$ (N data samples of D dimensions) can be exactly calculated with cost $\mathcal{O}(2^{NK})$ or, when advantageous, $\mathcal{O}(N^{dK-K+1})$, where $d = \mathrm{rank}(\mathbf{X})$ and $K < d$ [1],[2]. In applications where X is large (e.g., "big" data of large N and/or "heavy" data of large d), these costs are prohibitive. In this work, we present a novel suboptimal algorithm for the calculation of the K < d L1-PCs of X of cost $\mathcal{O}(ND \min\{N, D\} + N^2(K^4 + dK^2) + dNK^3)$, which is comparable to that of standard (L2-norm) PC analysis. Our theoretical and experimental studies show that the proposed algorithm calculates the exact optimal L1-PCs with high frequency and achieves higher value in the L1-PC optimization metric than any known alternative algorithm of comparable computational cost. The superiority of the calculated L1-PCs over standard L2-PCs (singular vectors) in characterizing potentially faulty data/measurements is demonstrated with experiments on data dimensionality reduction and disease diagnosis from genomic data.
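To make the bit-flipping idea concrete, here is a minimal Python sketch for the single-component case (K = 1) only. It is not the algorithm of the paper. It relies on the known equivalence that the first L1-PC, which maximizes $\|\mathbf{X}^\top \mathbf{q}\|_1$ over unit-norm q, can be written as $\mathbf{q} = \mathbf{X}\mathbf{b} / \|\mathbf{X}\mathbf{b}\|_2$ for the binary vector $\mathbf{b} \in \{\pm 1\}^N$ that maximizes $\|\mathbf{X}\mathbf{b}\|_2$. The function name `l1_pc_bit_flipping`, the greedy flip rule, the stopping tolerance, and the initialization from the signs of the dominant right singular vector are all assumptions made for illustration.

```python
import numpy as np


def l1_pc_bit_flipping(X, max_iter=1000, tol=1e-12):
    """Greedy single-bit-flipping heuristic for the first L1-PC (K = 1).

    Illustrative sketch only: a simplified stand-in, not the paper's method.
    X has shape (D, N): N samples of D dimensions.
    Returns (q, b) with q a unit-norm D-vector and b in {+1, -1}^N.
    """
    # Assumed initialization: signs of the dominant right singular vector.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    b = np.where(Vt[0] >= 0, 1.0, -1.0)

    col_energy = np.sum(X**2, axis=0)  # ||x_n||_2^2 for every sample

    for _ in range(max_iter):
        z = X @ b  # current X b
        # Flipping bit n turns X b into z - 2 b_n x_n, so the change in
        # ||X b||_2^2 is  -4 b_n x_n^T z + 4 ||x_n||_2^2.
        gains = -4.0 * b * (X.T @ z) + 4.0 * col_energy
        n = int(np.argmax(gains))
        if gains[n] <= tol:
            break  # no single flip increases the metric
        b[n] = -b[n]

    z = X @ b
    q = z / np.linalg.norm(z)
    return q, b
```

A possible usage, on synthetic data: `q, b = l1_pc_bit_flipping(np.random.default_rng(0).standard_normal((5, 12)))`. Each iteration of this sketch costs O(ND), and every accepted flip strictly increases $\|\mathbf{X}\mathbf{b}\|_2$, so the loop ends at a point where no single flip helps. Such a point is not guaranteed to be the global optimum, which is consistent with the abstract describing its algorithm as suboptimal.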
