Resolving the Complexity of Some Data Privacy Problems

Ryan Williams

Resolving the Complexity of Some Data Privacy Problems

Abstract

We formally study two methods for data sanitation that have been used extensively in the database community: k-anonymity and l-diversity. We settle several open problems concerning the difficulty of applying these methods optimally, proving both positive and negative results: 1. 2-anonymity is in P. 2. The problem of partitioning the edges of a triangle-free graph into 4-stars (degree-three vertices) is NP-hard. This yields an alternative proof that 3-anonymity is NP-hard even when the database attributes are all binary. 3. 3-anonymity with only 27 attributes per record is MAX SNP-hard. 4. For databases with n rows, k-anonymity is in O(4n poly(n)) time for all k > 1. 5. For databases with n rows and l <= log2c+2 log n attributes over an alphabet of cardinality c = O(1), k-anonymity is in P. Assuming c, l = O(1), k-anonymity is in O(n). 6. 3-diversity with binary attributes is NP-hard, with one sensitive attribute. 7. 2-diversity with binary attributes is NP-hard, with three sensitive attributes.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Or compile a full topic from this idea

Discussion (0)

Sign in to join the discussion.

Loading comments…