Potential adversarial samples for white-box attacks

Abstract

Deep convolutional neural networks can be highly vulnerable to small perturbations of their inputs, potentially a major issue or limitation on system robustness when using deep networks as classifiers. In this paper we propose a low-cost method to explore marginal sample data near trained classifier decision boundaries, thus identifying potential adversarial samples. By finding such adversarial samples it is possible to reduce the search space of adversarial attack algorithms while keeping a reasonable successful perturbation rate. In our developed strategy, the potential adversarial samples represent only 61% of the test data, but in fact cover more than 82% of the adversarial samples produced by iFGSM and 92% of the adversarial samples successfully perturbed by DeepFool on CIFAR10.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…