Universal Multi-Party Poisoning Attacks
Abstract
In this work, we demonstrate universal multi-party poisoning attacks that adapt and apply to any multi-party learning process with an arbitrary interaction pattern between the parties. More generally, we introduce and study $(k,p)$-poisoning attacks in which an adversary controls $k \in [m]$ of the $m$ parties, and for each corrupted party $P_i$, the adversary submits some poisoned data $T'_i$ on behalf of $P_i$ that is still ``$(1-p)$-close'' to the correct data $T_i$ (e.g., a $1-p$ fraction of $T'_i$ is still honestly generated). We prove that for any ``bad'' property $B$ of the final trained hypothesis $h$ (e.g., $h$ failing on a particular test example or having ``large'' risk) that has an arbitrarily small constant probability $\mu$ of happening without the attack, there always is a $(k,p)$-poisoning attack that increases the probability of $B$ from $\mu$ to $\mu^{1 - p \cdot k/m} = \mu + \Omega(p \cdot k/m)$. Our attack only uses clean labels, and it is online. More generally, we prove that for any bounded function $f(x_1,\dots,x_n) \in [0,1]$ defined over an $n$-step random process $X = (x_1,\dots,x_n)$, an adversary who can override each of the $n$ blocks with probability $p$ (even when the override events are dependent) can increase the expected output by at least $\Omega(p \cdot \mathrm{Var}[f(X)])$.
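To get a feel for the size of the stated increase, the following minimal sketch evaluates the post-attack probability, read here as $\mu^{1 - p \cdot k/m}$, for concrete parameters. The function and parameter names are hypothetical and chosen only for illustration; they are not taken from the paper. The linear comparison term is our own elementary observation (from $e^x \ge 1 + x$, one gets $\mu^{1-q} \ge \mu + \mu \ln(1/\mu) \cdot q$ with $q = p \cdot k/m$) and is not claimed to be the argument used in the proofs.

```python
import math


def poisoned_probability(mu: float, p: float, k: int, m: int) -> float:
    """Evaluate mu^(1 - p*k/m), the post-attack probability of the bad property.

    mu: probability of the bad property without any attack (0 < mu <= 1)
    p:  fraction of each corrupted party's data that may be poisoned (0 <= p <= 1)
    k:  number of corrupted parties (1 <= k <= m)
    m:  total number of parties
    All names are illustrative assumptions, not notation fixed by the paper.
    """
    if not 0.0 < mu <= 1.0:
        raise ValueError("mu must lie in (0, 1]")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if not 1 <= k <= m:
        raise ValueError("k must satisfy 1 <= k <= m")
    q = p * k / m  # effective fraction of the whole process under adversarial control
    return mu ** (1.0 - q)


def linear_lower_bound(mu: float, p: float, k: int, m: int) -> float:
    """Elementary bound mu + mu*ln(1/mu)*q, which follows from e^x >= 1 + x."""
    q = p * k / m
    return mu + mu * math.log(1.0 / mu) * q


if __name__ == "__main__":
    # Example: the bad event has probability 1% and half the parties are
    # corrupted, each with half of its data poisoned.
    mu, p, k, m = 0.01, 0.5, 5, 10
    print("without attack:    ", mu)
    print("with attack:       ", poisoned_probability(mu, p, k, m))
    print("linear lower bound:", linear_lower_bound(mu, p, k, m))
```

For constant $\mu$ the factor $\mu \ln(1/\mu)$ is itself a constant, which is consistent with the additive gain scaling linearly in $p \cdot k/m$.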