Mitigating large adversarial perturbations on X-MAS (X minus Moving Averaged Samples)
Abstract
We propose a scheme that mitigates the adversarial perturbation ε on an adversarial example Xadv (= X ± ε, where X is a benign sample) by subtracting the estimated perturbation ε̂ from X + ε and adding ε̂ to X − ε. The estimated perturbation ε̂ is derived from the difference between Xadv and its moving-averaged outcome Wavg*Xadv, where Wavg is an N × N moving-average kernel whose coefficients are all one. Adjacent samples of an image are usually close to each other, so we can let X ≈ Wavg*X (we name this relation X-MAS, X minus Moving Averaged Samples). As a result, the estimated perturbation ε̂ falls within the range of ε. The scheme is also extended to multi-level mitigation by treating the mitigated adversarial example Xadv ∓ ε̂ as a new adversarial example to be mitigated. Multi-level mitigation brings Xadv closer to X, leaving a smaller (i.e. mitigated) perturbation than the original unmitigated one, by setting the moving-averaged adversarial sample Wavg*Xadv (which has a smaller perturbation than Xadv if X ≈ Wavg*X) as the boundary condition that the mitigation cannot cross (i.e. a decreasing ε cannot go below it and an increasing ε cannot go beyond it). With multi-level mitigation, we obtain high prediction accuracy even for adversarial examples with a large perturbation (i.e. ε > 16). The proposed scheme is evaluated with adversarial examples crafted by FGSM (Fast Gradient Sign Method) based attacks on ResNet-50 trained on the ImageNet dataset.
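The mitigation loop described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state how the kernel is normalized, how large each mitigation step is, or how many levels are used, so the division by N², the `step` and `levels` parameters, the edge padding, and the function names `moving_average` and `xmas_mitigate` are all assumptions made for illustration. The random ±16 pattern only stands in for an FGSM-style perturbation; no real attack or classifier is involved.

```python
import numpy as np


def moving_average(x, n=3):
    """N x N box filter on a 2-D image (n odd), edge-padded.

    Assumption: the all-ones kernel is normalized by n*n so the
    output stays on the same intensity scale as the input.
    """
    pad = n // 2
    xp = np.pad(x, pad, mode="edge")
    out = np.zeros_like(x, dtype=np.float64)
    for dy in range(n):
        for dx in range(n):
            out += xp[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out / (n * n)


def xmas_mitigate(x_adv, n=3, step=4.0, levels=4):
    """Hypothetical multi-level mitigation toward W_avg * X_adv.

    Each level estimates the perturbation from the gap between the
    current sample and the moving-averaged adversarial sample, then
    removes at most `step` of it, so a pixel never crosses the bound.
    """
    x = x_adv.astype(np.float64)
    bound = moving_average(x, n)  # W_avg * X_adv, fixed boundary condition
    for _ in range(levels):
        diff = x - bound
        # eps_hat > 0 is subtracted (X + eps case),
        # eps_hat < 0 is effectively added back (X - eps case).
        eps_hat = np.sign(diff) * np.minimum(np.abs(diff), step)
        x = x - eps_hat
    return x


# Usage on synthetic data: a smooth gradient image, so X ≈ W_avg * X holds.
rng = np.random.default_rng(0)
x = np.tile(np.linspace(50.0, 200.0, 32), (32, 1))
eps = 16.0 * rng.choice([-1.0, 1.0], size=x.shape)  # stand-in perturbation
x_adv = x + eps
x_mit = xmas_mitigate(x_adv)
print(np.abs(x_adv - x).mean(), np.abs(x_mit - x).mean())
```

The two printed values are the mean absolute perturbation before and after mitigation. In this sketch the boundary is computed once from the original adversarial example, which is one reading of the boundary condition described above; recomputing it at every level would be a different design.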