Perceptual Evaluation of Adversarial Attacks for CNN-based Image Classification
Abstract
Deep neural networks (DNNs) have recently achieved state-of-the-art performance and driven significant progress in many machine learning tasks, such as image classification, speech processing, and natural language processing. However, recent studies have shown that DNNs are vulnerable to adversarial attacks. For instance, in the image classification domain, adding small imperceptible perturbations to the input image is sufficient to fool the DNN and cause misclassification. The perturbed image, called an adversarial example, should be visually as close as possible to the original image. However, all the works proposed in the literature for generating adversarial examples have used the Lp norms (L0, L2 and L∞) as distance metrics to quantify the similarity between the original image and the adversarial example. Nonetheless, the Lp norms do not correlate with human judgment, making them unsuitable for reliably assessing the perceptual similarity/fidelity of adversarial examples. In this paper, we present a database for visual fidelity assessment of adversarial examples. We describe the creation of the database and evaluate the performance of fifteen state-of-the-art full-reference (FR) image fidelity assessment metrics that could substitute for Lp norms. The database and the subjective scores are publicly available to help design new metrics for adversarial examples and to facilitate future research.
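For readers unfamiliar with the Lp distances named in the abstract, the following is a minimal sketch of how L0, L2 and L∞ could be computed between an original image and its adversarial counterpart. It is an illustration only, not code from the paper: the function name `lp_distances` and the toy 2×2 arrays are assumptions introduced here, and NumPy is assumed to be available.

```python
import numpy as np


def lp_distances(original, adversarial):
    """Return the L0, L2 and L-infinity distances between two images.

    Hypothetical helper for illustration; both inputs are arrays of the
    same shape holding pixel values.
    """
    # Work in floating point so unsigned integer pixels do not wrap around.
    delta = adversarial.astype(np.float64) - original.astype(np.float64)
    flat = delta.ravel()
    return {
        # L0: number of pixel values that were changed at all.
        "l0": int(np.count_nonzero(flat)),
        # L2: Euclidean length of the perturbation.
        "l2": float(np.linalg.norm(flat, ord=2)),
        # L-infinity: largest absolute change to any single pixel value.
        "linf": float(np.max(np.abs(flat))),
    }


# Toy example: a 2x2 "image" in which two pixel values are perturbed.
original = np.zeros((2, 2), dtype=np.uint8)
adversarial = np.array([[3, 0], [0, 4]], dtype=np.uint8)
distances = lp_distances(original, adversarial)
print(distances)
```

Each of these quantities depends only on the magnitudes of the pixel differences, not on where in the image the changes fall or what content surrounds them, which gives some intuition for why the abstract argues that they are a poor proxy for perceived fidelity.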