DPSQL+: A Differentially Private SQL Library with a Minimum Frequency Rule
Abstract
SQL is the de facto interface for exploratory data analysis; however, releasing exact query results can expose sensitive information through membership or attribute inference attacks. Differential privacy (DP) provides rigorous privacy guarantees, but in practice, DP alone may not satisfy governance requirements such as the minimum frequency rule, which requires each released group (cell) to include contributions from at least k distinct individuals. In this paper, we present DPSQL+, a privacy-preserving SQL library that simultaneously enforces user-level (ε, δ)-DP and the minimum frequency rule. DPSQL+ adopts a modular architecture consisting of: (i) a Validator that statically restricts queries to a DP-safe subset of SQL; (ii) an Accountant that consistently tracks cumulative privacy loss across multiple queries; and (iii) a Backend that interfaces with various database engines, ensuring portability and extensibility. Experiments on the TPC-H benchmark demonstrate that DPSQL+ achieves practical accuracy across a wide range of analytical workloads, from basic aggregates to quadratic statistics and join operations, and allows substantially more queries under a fixed global privacy budget than prior libraries in our evaluation.
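To make two of the ideas above concrete, the following is a minimal sketch of a minimum frequency filter and a privacy-budget accountant. It is not the DPSQL+ API: the names `min_frequency_filter` and `BasicAccountant`, the `(user_id, group_key)` row layout, and the use of basic sequential composition (summing ε and δ across queries) are all assumptions made for illustration. The abstract does not say which composition method the Accountant uses or how the rule is combined with noise. Note also that the filter here thresholds exact distinct-user counts, which is not differentially private on its own.

```python
from collections import defaultdict


def min_frequency_filter(rows, k):
    """Keep only groups with contributions from at least k distinct users.

    rows: iterable of (user_id, group_key) pairs (hypothetical layout).
    Returns {group_key: number_of_distinct_users} for surviving groups.
    """
    users_per_group = defaultdict(set)
    for user_id, group_key in rows:
        users_per_group[group_key].add(user_id)
    return {g: len(u) for g, u in users_per_group.items() if len(u) >= k}


class BasicAccountant:
    """Tracks cumulative (epsilon, delta) under basic sequential composition.

    Assumption: per-query costs simply add up. A real library may use a
    tighter accounting method.
    """

    def __init__(self, epsilon_budget, delta_budget):
        self.epsilon_budget = epsilon_budget
        self.delta_budget = delta_budget
        self.epsilon_spent = 0.0
        self.delta_spent = 0.0

    def charge(self, epsilon, delta):
        """Record one query's cost, or raise if the budget would be exceeded."""
        if (self.epsilon_spent + epsilon > self.epsilon_budget
                or self.delta_spent + delta > self.delta_budget):
            raise RuntimeError("privacy budget exhausted")
        self.epsilon_spent += epsilon
        self.delta_spent += delta


# Group "A" has three distinct users; group "B" has only one, so it is suppressed.
rows = [("u1", "A"), ("u2", "A"), ("u2", "A"), ("u3", "A"), ("u1", "B"), ("u1", "B")]
released = min_frequency_filter(rows, k=3)

# Charge one query against a global budget of epsilon = 1.0, delta = 1e-6.
accountant = BasicAccountant(epsilon_budget=1.0, delta_budget=1e-6)
accountant.charge(0.5, 0.0)
```

In this sketch a repeated contribution from the same user (here `u2` in group `A`) counts once, which matches the rule's wording of "k distinct individuals". Once the accountant's budget is spent, further calls to `charge` raise an error rather than allowing the query to run.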