Optimal Schemes for Discrete Distribution Estimation under Locally Differential Privacy
Abstract
We consider the minimax estimation problem of a discrete distribution with support size k under privacy constraints. A privatization scheme is applied to each raw sample independently, and we need to estimate the distribution of the raw samples from the privatized samples. A positive number ε measures the privacy level of a privatization scheme. For a given ε, we consider the problem of constructing optimal privatization schemes with ε-privacy level, i.e., schemes that minimize the expected estimation loss for the worst-case distribution. Two schemes in the literature provide order-optimal performance in the high privacy regime where ε is very close to 0, and in the low privacy regime where e^ε ≈ k, respectively. In this paper, we propose a new family of schemes which substantially improve the performance of the existing schemes in the medium privacy regime when 1 ≪ e^ε ≪ k. More concretely, we prove that when 3.8 < ε < ln(k/9), our schemes reduce the expected estimation loss by 50% under the ℓ_2^2 metric and by 30% under the ℓ_1 metric relative to the existing schemes. We also prove a lower bound for the region e^ε ≫ k, which implies that our schemes are order optimal in this regime.
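To make the setting concrete, the sketch below privatizes each raw sample independently and then estimates the distribution from the privatized samples only. It uses generic k-ary randomized response, a standard ε-locally-private mechanism, purely as an illustration; it is not the new family of schemes proposed in the paper, and the function names (krr_privatize, krr_estimate) are hypothetical.

```python
import math
import random
from collections import Counter


def krr_privatize(x, k, eps, rng):
    """Privatize one raw sample x in {0, ..., k-1} with k-ary randomized response.

    The true symbol is reported with probability e^eps / (e^eps + k - 1);
    otherwise one of the other k - 1 symbols is reported uniformly at random.
    The ratio of any two output probabilities is at most e^eps.
    """
    p_keep = math.exp(eps) / (math.exp(eps) + k - 1)
    if rng.random() < p_keep:
        return x
    y = rng.randrange(k - 1)        # uniform over the k - 1 symbols other than x
    return y if y < x else y + 1


def krr_estimate(reports, k, eps):
    """Unbiased estimate of the raw distribution from privatized reports.

    Inverts E[freq_i] = p_other + p_i * (p_keep - p_other).
    """
    n = len(reports)
    e = math.exp(eps)
    p_keep = e / (e + k - 1)
    p_other = 1.0 / (e + k - 1)
    counts = Counter(reports)
    return [(counts[i] / n - p_other) / (p_keep - p_other) for i in range(k)]
```

A typical use is to draw raw samples from some distribution, pass each one through krr_privatize, and feed the resulting reports to krr_estimate. The estimate always sums to 1 but is not clipped, so individual entries can be slightly negative when the number of samples is small.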