Attacking the Madry Defense Model with L1-based Adversarial Examples

Abstract

The Madry Lab recently hosted a competition designed to test the robustness of their adversarially trained MNIST model. Attacks were constrained to perturb each pixel of the input image by at most a scaled maximal L∞ distortion of ε = 0.3. This constraint discourages attacks that are not optimized for the L∞ distortion metric. Our experimental results demonstrate that, by relaxing the L∞ constraint of the competition, the elastic-net attack to deep neural networks (EAD) can generate transferable adversarial examples which, despite their high average L∞ distortion, have minimal visual distortion. These results call into question the use of L∞ as the sole measure of visual distortion, and further demonstrate the power of EAD at generating robust adversarial examples.
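To make the distinction between distortion metrics concrete, here is a minimal sketch that compares a dense perturbation sitting at the competition's L∞ bound with a sparse perturbation that changes only a few pixels by a large amount. It is not the EAD attack itself: the image is a random stand-in rather than an MNIST digit, the sparse perturbation is picked at random rather than optimized, and the helper name `distortions` is hypothetical. It only illustrates why an L∞ budget rules out sparse, L1-small perturbations.

```python
import numpy as np

def distortions(x, x_adv):
    """Return the L1, L2 and L-infinity norms of the perturbation x_adv - x."""
    delta = (x_adv - x).ravel()
    return {
        "l1": float(np.abs(delta).sum()),
        "l2": float(np.sqrt((delta ** 2).sum())),
        "linf": float(np.abs(delta).max()),
    }

rng = np.random.default_rng(0)

# Assumption: a random 28x28 array in [0, 1] stands in for an MNIST image.
x = rng.uniform(0.0, 1.0, size=(28, 28))

# Dense perturbation: every pixel moves by +/- eps, then is clipped to [0, 1].
eps = 0.3
signs = rng.choice([-1.0, 1.0], size=x.shape)
x_dense = np.clip(x + eps * signs, 0.0, 1.0)

# Sparse perturbation: only 10 pixels change, each pushed to the far extreme.
x_sparse = x.copy()
idx = rng.choice(x.size, size=10, replace=False)
x_sparse.flat[idx] = np.where(x_sparse.flat[idx] < 0.5, 1.0, 0.0)

d_dense = distortions(x, x_dense)
d_sparse = distortions(x, x_sparse)

print("dense :", d_dense)
print("sparse:", d_sparse)
```

In this sketch the dense perturbation respects the ε = 0.3 bound but has a large L1 norm, while the sparse one violates the bound on every pixel it touches yet has a much smaller L1 norm. An attack that penalizes L1 distortion, as EAD does, tends toward the second kind of perturbation, which is why it falls outside the competition's constraint.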
