Product Formalisms for Measures on Spaces with Binary Tree Structures: Representation, Visualization, and Multiscale Noise

Abstract

In this paper we present a theoretical foundation for representing a data set as a measure in a very large, hierarchically parametrized family of positive measures whose parameters can be computed explicitly (rather than estimated by optimization), and we illustrate its applicability to a wide range of data types. The pre-processing step then consists of representing data sets as simple measures. The theoretical foundation consists of a dyadic product formula representation lemma and a visualization theorem. We also define an additive multiscale noise model, which can be used to sample from dyadic measures, and a more general multiplicative multiscale noise model, which can be used to perturb continuous functions, Borel measures, and dyadic measures. The first two results are based on theorems. Since the data sample is represented as a measure, subsequent analysis can exploit statistical and measure-theoretic concepts and theories. Because the representation uses the very simple concept of a dyadic tree defined on the universe of a data set, and because its parameters are explicitly computable, easily interpretable, and easily visualized, it is widely applicable, and we hope that this approach will be broadly useful to mathematicians, statisticians, and computer scientists who are intrigued by or involved in data science, including its mathematical foundations.
