Fuzzy simplicial sets and their application to geometric data analysis

Abstract

In this article, we expand upon the concepts introduced by David Spivak concerning the relationship between the category UM of uber-metric spaces and the category sFuz of fuzzy simplicial sets. We show that fuzzy simplicial sets can be regarded as natural combinatorial generalizations of metric relations. We also take inspiration from UMAP to apply the theory to manifold learning, dimension reduction, and data visualization, while refining some of its constructions. We generalize the adjunction between UM and sFuz, derive an explicit description of colimits in UM, and show that UM can be embedded into sFuz. We prove analogous results for the category EPMet of extended pseudo-metric spaces. We further provide rigorous definitions of functors that make it possible to recursively merge sets of fuzzy simplicial sets, and we describe the adjunctions between the category of truncated fuzzy simplicial sets and sFuz, which we relate to persistent homology. Combining these constructions, we show a surprising connection between the well-known dimension reduction methods UMAP and Isomap and derive an alternative algorithm, which we call IsUMap, that combines some of the strengths of both methods. Additionally, we develop a new embedding method that preserves clusters detected in the original metric space, which we construct from the data. Visualizing the optimization process gives the user information about both the intra-cluster distributions in the original metric space and the inter-cluster relations. We compare our new method with UMAP, Isomap, and t-SNE on a series of low- and high-dimensional datasets and provide explanations for the observed differences and improvements.
