Sparse Point-wise Privacy Leakage: Mechanism Design and Fundamental Limits
Abstract
We study an information-theoretic privacy mechanism design problem in which an agent observes useful data Y that is arbitrarily correlated with sensitive data X, and designs disclosed data U generated from Y (the agent has no direct access to X). We introduce sparse point-wise privacy leakage, a worst-case privacy criterion that enforces two simultaneous constraints for every disclosed symbol u∈U: (i) u may be correlated with at most N realizations of X, and (ii) the total leakage toward those realizations is bounded. In the high-privacy regime, we use concepts from information geometry to obtain a local quadratic approximation of the mutual information between U and Y, which measures utility. When the leakage matrix P_{X|Y} is invertible, this approximation reduces the design problem to a sparse quadratic maximization, known as the Rayleigh-quotient problem, with an ℓ0 constraint. We further show that, for the approximated problem, one can without loss of optimality restrict attention to a binary released variable U with a uniform distribution. For small alphabet sizes, the exact sparsity-constrained optimum can be computed via combinatorial support enumeration, which quickly becomes intractable as the dimension grows. For general dimensions, the resulting sparse Rayleigh-quotient maximization is NP-hard and closely related to sparse principal component analysis (PCA). We propose a convex semidefinite programming (SDP) relaxation that is solvable in polynomial time and provides a tractable surrogate for the NP-hard design, together with a simple rounding procedure to recover a feasible leakage direction. We also identify a sparsity threshold beyond which the sparse optimum saturates at the unconstrained spectral value and the SDP relaxation becomes tight.
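The support-enumeration step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the matrix of the quadratic form, so the symmetric matrix `A` below is a hypothetical stand-in, and the function name `sparse_rayleigh_enum` is likewise an assumption made for illustration. The sketch maximizes the Rayleigh quotient x^T A x / x^T x over vectors with at most N nonzero entries by taking the largest eigenvalue of every N-by-N principal submatrix; it omits any further constraints the paper's formulation may impose.

```python
from itertools import combinations

import numpy as np


def sparse_rayleigh_enum(A, N):
    """Brute-force maximum of x^T A x / x^T x subject to ||x||_0 <= N.

    A is assumed symmetric (a hypothetical stand-in for the paper's matrix).
    Returns the best value and a unit-norm maximizer with at most N nonzeros.
    """
    n = A.shape[0]
    k = min(N, n)  # a support cannot exceed the dimension
    best_val, best_x = -np.inf, None
    # By eigenvalue interlacing, enlarging a support never lowers the top
    # eigenvalue of the principal submatrix, so supports of size exactly k suffice.
    for support in combinations(range(n), k):
        idx = np.array(support)
        sub = A[np.ix_(idx, idx)]
        eigvals, eigvecs = np.linalg.eigh(sub)  # eigenvalues in ascending order
        if eigvals[-1] > best_val:
            best_val = eigvals[-1]
            best_x = np.zeros(n)
            best_x[idx] = eigvecs[:, -1]
    return best_val, best_x


# Toy example: the top eigenvector of A is supported on the first two coordinates.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])
values = {N: sparse_rayleigh_enum(A, N)[0] for N in (1, 2, 3)}
```

The loop visits C(n, N) supports, which is why this approach is practical only for small alphabets. In the toy example the value for N = 2 already equals the largest eigenvalue of `A`, a small instance of the saturation behavior the abstract describes.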