Cascading Bandit under Differential Privacy
Abstract
This paper studies differential privacy (DP) and local differential privacy (LDP) in cascading bandits. Under DP, we propose an algorithm that guarantees $\epsilon$-indistinguishability and a regret of $O((\frac{\log T}{\epsilon})^{1+\xi})$ for an arbitrarily small $\xi$. This is a significant improvement over the $O(\frac{\log^3 T}{\epsilon})$ regret of previous work. Under $(\epsilon,\delta)$-LDP, we relax the $K^2$ dependence through the tradeoff between the privacy budget $\epsilon$ and the error probability $\delta$, and obtain a regret of $O(\frac{K \log(1/\delta) \log T}{\epsilon^2})$, where $K$ is the size of the arm subset. This result holds for both the Gaussian mechanism and the Laplace mechanism, via an analysis of composition. Our results extend to combinatorial semi-bandits. We show respective lower bounds for DP and LDP cascading bandits. Extensive experiments corroborate our theoretical findings.
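To make the LDP setting concrete, here is a minimal sketch of the standard Laplace mechanism applied on the user's side to a length-K binary feedback vector. It is an illustration under simplifying assumptions, not the paper's algorithm: the function names `laplace_noise` and `privatize_feedback` are hypothetical, the feedback is treated as a fixed-length vector of clicks in [0, 1], and only pure $\epsilon$-LDP is shown. Because the whole vector has L1 sensitivity K, each entry receives noise of scale $K/\epsilon$, so the per-entry noise variance is $2K^2/\epsilon^2$; this may help explain why a naive approach picks up a $K^2$ factor.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_feedback(clicks, epsilon, rng):
    """Perturb a user's K click observations before they leave the device.

    Assumption (illustrative): each entry lies in [0, 1], so the vector
    has L1 sensitivity K, and Laplace noise of scale K / epsilon per
    entry yields epsilon-LDP for the whole vector.
    """
    k = len(clicks)
    scale = k / epsilon
    return [c + laplace_noise(scale, rng) for c in clicks]


rng = random.Random(0)
true_clicks = [0, 0, 1, 0]  # hypothetical feedback for K = 4 items
noisy = privatize_feedback(true_clicks, epsilon=1.0, rng=rng)

# The learner only sees noisy reports; averaging many of them
# recovers the underlying click rate because the noise has mean zero.
n_users = 20000
reports = [privatize_feedback(true_clicks, 1.0, rng) for _ in range(n_users)]
estimated_rates = [sum(r[i] for r in reports) / n_users for i in range(4)]
```

In this sketch the server never observes `true_clicks`, only the perturbed vectors, and `estimated_rates` approaches the true per-item click rates as the number of users grows.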