Patch of Invisibility: Naturalistic Physical Black-Box Adversarial Attacks on Object Detectors
Abstract
Adversarial attacks on deep learning models have received increased attention in recent years. Work in this area has mostly focused on gradient-based techniques, so-called 'white-box' attacks, where the attacker has access to the targeted model's internal parameters; such an assumption is usually untenable in the real world. Additionally, some attacks perturb the entire pixel space to fool a given model, which is neither practical nor physically realizable. To address these problems we propose the BBNP algorithm (Black-Box Naturalistic Patch): a direct, black-box, gradient-free method that uses the learned image manifold of a pretrained generative adversarial network (GAN) to generate naturalistic adversarial patches for object detectors. The method is model-agnostic: it attacks object detection models by relying solely on their outputs. Evaluating our approach on five models and comparing it against five black-box and two white-box attacks, we show that our proposed method achieves state-of-the-art results, outperforming all other tested black-box approaches.
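The core idea in the abstract, searching a generator's latent space using only the detector's outputs, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: `toy_generator` and `toy_detector_score` are hypothetical stand-ins for a pretrained GAN and a real object detector, and the simple random-search loop is a generic gradient-free optimizer, not necessarily the one BBNP uses (the abstract does not specify it).

```python
import numpy as np

def toy_generator(z, W):
    """Hypothetical stand-in for a pretrained GAN generator: latent -> 8x8 patch in [0, 1]."""
    logits = np.clip(W @ z, -50.0, 50.0)  # clip to keep exp() numerically safe
    return (1.0 / (1.0 + np.exp(-logits))).reshape(8, 8)

def toy_detector_score(patch):
    """Hypothetical stand-in for a detector's confidence when the patch is present.
    Lower means the detector is fooled more. A real attack would query the detector."""
    return float(np.mean(patch))

def black_box_latent_search(score_fn, dim, iters=200, pop=16, sigma=0.3, seed=0):
    """Gradient-free search: perturb the best latent, keep a candidate only if it lowers the score.
    Uses only score_fn's outputs, never its gradients or parameters."""
    rng = np.random.default_rng(seed)
    best_z = rng.standard_normal(dim)
    best_score = score_fn(best_z)
    history = [best_score]
    for _ in range(iters):
        candidates = best_z + sigma * rng.standard_normal((pop, dim))
        scores = np.array([score_fn(c) for c in candidates])
        i = int(np.argmin(scores))
        if scores[i] < best_score:
            best_z, best_score = candidates[i], float(scores[i])
        history.append(best_score)
    return best_z, best_score, history

# Usage: fixed random weights play the role of the "pretrained" generator.
W = np.random.default_rng(42).standard_normal((64, 16))
score = lambda z: toy_detector_score(toy_generator(z, W))
z_best, s_best, hist = black_box_latent_search(score, dim=16)
patch = toy_generator(z_best, W)
```

Because the search moves only in latent space, every candidate patch is an output of the generator; with a real GAN, this is what would keep patches on the learned image manifold and hence naturalistic.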