Scaffoldings and Spines: Organizing High-Dimensional Data Using Cover Trees, Local Principal Component Analysis, and Persistent Homology
Abstract
We propose a flexible, multi-scale method for organizing, visualizing, and understanding datasets sampled from or near stratified spaces. The first part of the algorithm produces a cover tree using adaptive thresholds based on a combination of multi-scale local principal component analysis and topological data analysis. Each node of the resulting cover tree consists of points within or near the same stratum of the stratified space. These nodes are then connected to form a scaffolding graph, which is simplified and collapsed into a spine graph, from which the stratified structure becomes apparent. We demonstrate our technique on several synthetic point cloud examples and use it to understand song structure in musical audio data.
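To make the multi-scale local principal component analysis ingredient concrete, the following is a minimal sketch, not the authors' implementation. It estimates a local dimension around a chosen point at several radii by counting the covariance eigenvalues that are large relative to the largest one. The function name `local_pca_dimension`, the relative threshold of 0.1, and the synthetic plane-plus-line point cloud are all assumptions made for illustration; the abstract does not specify how the adaptive thresholds are chosen.

```python
import numpy as np

def local_pca_dimension(points, center, radius, rel_threshold=0.1):
    """Estimate the local dimension near `center` at scale `radius`.

    Hypothetical helper: counts eigenvalues of the local covariance that
    are at least `rel_threshold` times the largest eigenvalue.
    """
    dists = np.linalg.norm(points - center, axis=1)
    neighbors = points[dists <= radius]
    if len(neighbors) < 3:
        return 0  # too few points to say anything at this scale
    centered = neighbors - neighbors.mean(axis=0)
    # Squared singular values are proportional to covariance eigenvalues.
    eigvals = np.linalg.svd(centered, compute_uv=False) ** 2
    if eigvals[0] == 0:
        return 0
    return int(np.sum(eigvals >= rel_threshold * eigvals[0]))

# Synthetic stratified space (an assumption for this demo): a plane patch
# z = 0 glued to a line segment along the z-axis, with small noise.
rng = np.random.default_rng(0)
plane = np.column_stack([rng.uniform(-1, 1, 2000),
                         rng.uniform(-1, 1, 2000),
                         np.zeros(2000)])
line = np.column_stack([np.zeros(500),
                        np.zeros(500),
                        rng.uniform(0, 2, 500)])
cloud = np.vstack([plane, line])
cloud += 0.01 * rng.normal(size=cloud.shape)

radii = (0.2, 0.4)
on_line = np.array([0.0, 0.0, 1.5])      # well inside the 1-D stratum
on_plane = np.array([0.6, 0.6, 0.0])     # well inside the 2-D stratum
at_junction = np.array([0.0, 0.0, 0.0])  # where the strata meet

dims_line = [local_pca_dimension(cloud, on_line, r) for r in radii]
dims_plane = [local_pca_dimension(cloud, on_plane, r) for r in radii]
dims_junction = [local_pca_dimension(cloud, at_junction, r) for r in radii]

print("line:", dims_line)
print("plane:", dims_plane)
print("junction:", dims_junction)
```

In this toy setting the estimate is stable across scales for points deep inside a single stratum and tends to change near the junction, which is the kind of signal an adaptive threshold could use when deciding whether a group of points belongs to one stratum.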