Priv'IT: Private and Sample Efficient Identity Testing
Abstract
We develop differentially private hypothesis testing methods for the small sample regime. Given a sample D from a categorical distribution p over some domain Σ, an explicitly described distribution q over Σ, some privacy parameter ε, accuracy parameter α, and requirements β_I and β_II for the type I and type II errors of our test, the goal is to distinguish between p = q and d_TV(p,q) ≥ α. We provide theoretical bounds for the sample size |D| so that our method both satisfies (ε,0)-differential privacy and guarantees type I and type II errors of at most β_I and β_II. We show that differential privacy may come for free in some regimes of parameters, and we always beat the sample complexity resulting from running the χ²-test with noisy counts, or standard approaches such as repetition for endowing non-private χ²-style statistics with differential privacy guarantees. We experimentally compare the sample complexity of our method to that of recently proposed methods for private hypothesis testing.
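To make the setup concrete, the following is a minimal sketch of the quantities the abstract names: the total variation distance between p and q, and the baseline it compares against, a χ² statistic computed on noisy counts. It is not the paper's own test. The function names, the choice of Laplace noise, and the noise scale 2/ε (which assumes neighboring datasets differ by replacing one sample, so that two counts each change by 1) are assumptions made for illustration. No rejection threshold is given here, because the abstract does not state one.

```python
import random
from collections import Counter

def total_variation(p, q):
    """d_TV(p, q) = (1/2) * sum_i |p_i - q_i| for two distributions on the same domain."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

def laplace(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_counts(sample, domain_size, epsilon, rng):
    """Histogram of the sample with independent Laplace noise added to every bin.

    Assumption: replacing one sample changes two bins by 1 each, so the
    L1 sensitivity is 2 and noise of scale 2/epsilon gives (epsilon, 0)-DP.
    """
    counts = Counter(sample)
    return [counts.get(i, 0) + laplace(2.0 / epsilon, rng) for i in range(domain_size)]

def chi2_statistic(counts, q, n):
    """Pearson chi-squared statistic of (possibly noisy) counts against q."""
    return sum((c - n * qi) ** 2 / (n * qi) for c, qi in zip(counts, q))

# Illustrative run: q is uniform on 4 symbols and the sample is drawn from q itself.
rng = random.Random(0)
q = [0.25, 0.25, 0.25, 0.25]
n = 1000
sample = rng.choices(range(len(q)), weights=q, k=n)
private_counts = noisy_counts(sample, len(q), epsilon=1.0, rng=rng)
print("noisy chi2 statistic:", chi2_statistic(private_counts, q, n))
```

The added noise inflates the statistic even when p = q, which is one intuition for why this baseline needs more samples than a non-private test.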