Optimal Bound for PCA with Outliers using Higher-Degree Voronoi Diagrams

Abstract

In this paper, we introduce new algorithms for Principal Component Analysis (PCA) with outliers. Using techniques from computational geometry, specifically higher-degree Voronoi diagrams, we navigate to the optimal subspace for PCA even in the presence of outliers. This approach achieves an optimal solution with a time complexity of $n^{d+O(1)}\,\mathrm{poly}(n,d)$. Additionally, we present a randomized algorithm with a complexity of $2^{O(r(d-r))} \times \mathrm{poly}(n,d)$. This algorithm samples subspaces characterized in terms of a Grassmannian manifold. By employing this sampling method, we ensure a high likelihood of capturing the optimal subspace, with success probability $(1-\delta)^T$, where $\delta$ represents the probability that a sampled subspace does not contain the optimal solution and $T$ is the number of subspaces sampled, proportional to $2^{r(d-r)}$. Our use of higher-degree Voronoi diagrams and Grassmannian-based sampling offers a clearer conceptual pathway and practical advantages, particularly in handling large datasets or higher-dimensional settings.
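
To make the sampling idea concrete, the following is a minimal sketch of a naive baseline, not the algorithm of the paper. It assumes the objective is the sum of squared distances to an r-dimensional linear subspace through the origin after discarding the k worst-fitting points, and it draws subspaces uniformly from the Grassmannian by orthonormalizing a Gaussian matrix. The function names, the parameter `k`, and the uniform sampling scheme are all assumptions introduced for illustration.

```python
import numpy as np


def sample_subspace(d, r, rng):
    """Return a d x r orthonormal basis of a random r-dimensional subspace of R^d.

    The column span of a Gaussian matrix is uniformly distributed on the
    Grassmannian; reduced QR gives an orthonormal basis of that span.
    """
    gaussian = rng.standard_normal((d, r))
    q, _ = np.linalg.qr(gaussian)
    return q


def outlier_pca_cost(points, basis, k):
    """Sum of squared distances to the subspace, ignoring the k farthest points.

    Assumption: this is the PCA-with-outliers objective for a subspace
    through the origin (no centering is applied).
    """
    projections = points @ basis @ basis.T
    sq_dist = np.sum((points - projections) ** 2, axis=1)
    n = points.shape[0]
    return float(np.sort(sq_dist)[: n - k].sum())


def best_sampled_subspace(points, r, k, num_samples, seed=0):
    """Sample num_samples subspaces and keep the one with the lowest cost."""
    rng = np.random.default_rng(seed)
    d = points.shape[1]
    best_basis, best_cost = None, np.inf
    for _ in range(num_samples):
        basis = sample_subspace(d, r, rng)
        cost = outlier_pca_cost(points, basis, k)
        if cost < best_cost:
            best_basis, best_cost = basis, cost
    return best_basis, best_cost
```

Here `num_samples` plays the role of $T$ in the abstract. The sketch returns only the best of the sampled subspaces, so it illustrates the sample-and-evaluate loop rather than any optimality guarantee.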
