Marginality: a numerical mapping for enhanced treatment of nominal and hierarchical attributes

Abstract

The purpose of statistical disclosure control (SDC) of microdata, a.k.a. data anonymization or privacy-preserving data mining, is to publish data sets containing the answers of individual respondents in such a way that the respondents corresponding to the released records cannot be re-identified and the released data are analytically useful. SDC methods are either based on masking the original data, generating synthetic versions of them or creating hybrid versions by combining original and synthetic data. The choice of SDC methods for categorical data, especially nominal data, is much smaller than the choice of methods for numerical data. We mitigate this problem by introducing a numerical mapping for hierarchical nominal data which allows computing means, variances and covariances on them.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…