Delta-Closure Structure for Studying Data Distribution

Abstract

In this paper, we revisit pattern mining and study the distribution underlying a binary dataset through the closure structure, which is based on passkeys, i.e., minimum generators in equivalence classes robust to noise. We introduce Δ-closedness, a generalization of the closure operator, where Δ measures how a closed set differs from its upper neighbors in the partial order induced by closure. A Δ-class of equivalence includes minimum and maximum elements and allows us to characterize the distribution underlying the data. Moreover, the set of Δ-classes of equivalence can be partitioned into the so-called Δ-closure structure. In particular, a Δ-class of equivalence with a high level demonstrates correlations among many attributes, which are supported by more observations when Δ is large. In the experiments, we study the Δ-closure structure of several real-world datasets and show that this structure is very stable for large Δ and does not substantially depend on the data sampling used for the analysis.
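To make the notions of closure and Δ-closedness concrete, here is a minimal Python sketch on a toy binary dataset. It is an illustration under stated assumptions, not the paper's algorithm: the dataset and the helper names (`extent`, `closure`, `delta`, `is_delta_closed`) are hypothetical, and Δ is assumed to be the smallest loss of supporting observations incurred when a single attribute is added to the set, which is one common reading of "how a closed set differs from its upper neighbors".

```python
from typing import FrozenSet, List, Set

# Hypothetical toy binary dataset: each row lists the attributes present
# in one observation.
DATA: List[FrozenSet[str]] = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ab"),
    frozenset("ab"),
    frozenset("a"),
    frozenset("cd"),
]
ATTRIBUTES: Set[str] = set().union(*DATA)


def extent(itemset: Set[str]) -> Set[int]:
    """Indices of the observations that contain every attribute of itemset."""
    return {i for i, row in enumerate(DATA) if itemset <= row}


def closure(itemset: Set[str]) -> Set[str]:
    """Largest attribute set shared by all observations in extent(itemset)."""
    rows = extent(itemset)
    if not rows:
        # No observation supports the itemset: its closure is all attributes.
        return set(ATTRIBUTES)
    return set.intersection(*(set(DATA[i]) for i in rows))


def delta(itemset: Set[str]) -> int:
    """Assumed definition: smallest support drop over all one-attribute extensions."""
    support = len(extent(itemset))
    outside = ATTRIBUTES - itemset
    if not outside:
        return support
    return min(support - len(extent(itemset | {m})) for m in outside)


def is_delta_closed(itemset: Set[str], threshold: int) -> bool:
    """True if every one-attribute extension loses at least `threshold` observations."""
    return delta(itemset) >= threshold


print(closure({"b"}))                  # closure of {b} in the toy data
print(delta({"a", "b"}))               # support drop to the nearest extension
print(is_delta_closed({"a", "b"}, 2))  # Δ-closedness check for threshold 2
```

Under this reading, a set with Δ = 0 is not closed at all (some attribute can be added without losing any observation), Δ ≥ 1 coincides with ordinary closedness, and larger thresholds keep only closed sets whose extensions are much less supported.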
