Distributed Differential Privacy in Multi-Armed Bandits

Abstract

We consider the standard K-armed bandit problem under a distributed trust model of differential privacy (DP), which makes it possible to guarantee privacy without a trustworthy server. Under this trust model, previous work largely focuses on achieving privacy using a shuffle protocol, where a batch of users' data is randomly permuted before being sent to a central server. This protocol achieves an (ε,δ) or approximate-DP guarantee by sacrificing an additional additive $O\!\left(\frac{K \log T \sqrt{\log(1/\delta)}}{\epsilon}\right)$ cost in T-step cumulative regret. In contrast, the optimal privacy cost for achieving a stronger (ε,0) or pure-DP guarantee under the widely used central trust model is only $\Theta\!\left(\frac{K \log T}{\epsilon}\right)$, which, however, requires a trusted server. In this work, we aim to obtain a pure-DP guarantee under the distributed trust model while sacrificing no more regret than under the central trust model. We achieve this by designing a generic bandit algorithm based on successive arm elimination, where privacy is guaranteed by corrupting rewards with an equivalent discrete Laplace noise ensured by a secure computation protocol. We also show that our algorithm, when instantiated with Skellam noise and the secure protocol, ensures Rényi differential privacy (a stronger notion than approximate DP) under the distributed trust model with a privacy cost of $O\!\left(\frac{K \sqrt{\log T}}{\epsilon}\right)$.
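
To make the idea of successive arm elimination with discrete Laplace noise concrete, the following is a minimal sketch under several assumptions that are not taken from the paper: rewards are Bernoulli (so each user's contribution has sensitivity 1), batch lengths double each round, and the confidence radius uses illustrative constants. The function names `discrete_laplace` and `private_successive_elimination` are hypothetical. The noise here is sampled in one place purely for illustration; it stands in for the aggregate noise that the secure computation protocol would produce and does not implement that protocol.

```python
import math
import random


def discrete_laplace(eps, rng):
    """Sample an integer Z with P(Z = k) proportional to exp(-eps * |k|).

    Uses the standard fact that the difference of two i.i.d. geometric
    variables on {0, 1, 2, ...} follows a discrete Laplace distribution.
    """
    p = math.exp(-eps)

    def geometric():
        # Inverse-CDF sampling: P(G >= k) = p**k.
        u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is finite
        return int(math.floor(math.log(u) / math.log(p)))

    return geometric() - geometric()


def private_successive_elimination(means, horizon, eps, seed=0):
    """Return the arms still active when the pull budget runs out.

    Assumption: Bernoulli rewards with the given means (illustration only).
    """
    rng = random.Random(seed)
    active = list(range(len(means)))
    used, batch_len = 0, 2
    # Illustrative log factor; not the constants used in the paper.
    log_term = math.log(4 * len(means) * horizon)

    while len(active) > 1 and used + batch_len * len(active) <= horizon:
        noisy_mean = {}
        for arm in active:
            # Sum of one batch of fresh rewards for this arm.
            total = sum(rng.random() < means[arm] for _ in range(batch_len))
            # Only the noisy batch sum is revealed to the learner.
            noisy_mean[arm] = (total + discrete_laplace(eps, rng)) / batch_len
        used += batch_len * len(active)

        # Sampling error plus the extra error caused by the privacy noise.
        radius = (math.sqrt(log_term / (2 * batch_len))
                  + log_term / (eps * batch_len))
        best = max(noisy_mean.values())
        # Keep an arm only if its upper bound reaches the best lower bound.
        active = [a for a in active if noisy_mean[a] + radius >= best - radius]
        batch_len *= 2

    return active
```

For example, `private_successive_elimination([0.9, 0.1], horizon=100_000, eps=1.0)` returns the list of surviving arm indices. The privacy term in the radius shrinks as 1/(ε · batch length), which is why the noise contributes only an additive cost on top of the non-private regret.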
