Towards Certifying L-infinity Robustness using Neural Networks with L-inf-dist Neurons
Abstract
It is well known that standard neural networks, even those with high classification accuracy, are vulnerable to small ℓ∞-norm bounded adversarial perturbations. Although many attempts have been made, most previous works either provide only empirical verification of a defense against a particular attack method, or develop a certified guarantee of model robustness only in limited scenarios. In this paper, we seek a new approach: a theoretically principled neural network that inherently resists ℓ∞ perturbations. In particular, we design a novel neuron that uses ℓ∞-distance as its basic operation (which we call the ℓ∞-dist neuron), and show that any neural network constructed with ℓ∞-dist neurons (called an ℓ∞-dist net) is naturally a 1-Lipschitz function with respect to the ℓ∞-norm. This directly provides a rigorous guarantee of certified robustness based on the margin of the prediction outputs. We then prove that such networks have enough expressive power to approximate any 1-Lipschitz function, with a robust generalization guarantee. We further provide a holistic training strategy that can greatly alleviate optimization difficulties. Experimental results show that, using ℓ∞-dist nets as basic building blocks, we consistently achieve state-of-the-art performance on commonly used datasets: 93.09% certified accuracy on MNIST (ε=0.3), 35.42% on CIFAR-10 (ε=8/255) and 16.31% on TinyImageNet (ε=1/255).
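The link between ℓ∞-distance neurons, the 1-Lipschitz property and margin-based certification can be illustrated with a minimal sketch. It assumes a neuron computes ‖x − w‖∞ + b; the abstract only says the neuron uses ℓ∞-distance as its basic operation, so the bias term, the function names, the argmax prediction rule and the random toy network below are all illustrative assumptions, not the authors' implementation.

```python
import random

def linf(a, b):
    """l-infinity distance between two equal-length vectors."""
    return max(abs(ai - bi) for ai, bi in zip(a, b))

def linf_dist_neuron(x, w, b=0.0):
    # Assumed neuron form: l-infinity distance to the weight vector plus a bias.
    return linf(x, w) + b

def linf_dist_net(x, layers):
    # Each layer is (weights, biases); stacking 1-Lipschitz maps stays 1-Lipschitz.
    for weights, biases in layers:
        x = [linf_dist_neuron(x, w, b) for w, b in zip(weights, biases)]
    return x

def certified_radius(outputs):
    # If every output moves by at most eps under an eps-bounded perturbation,
    # the argmax cannot change while eps < (top - runner_up) / 2.
    top, runner_up = sorted(outputs, reverse=True)[:2]
    return (top - runner_up) / 2.0

rng = random.Random(0)  # fixed seed so the toy example is reproducible

def random_layer(n_in, n_out):
    # Hypothetical helper: random weights and biases for a toy network.
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases

layers = [random_layer(4, 8), random_layer(8, 3)]
x = [rng.uniform(0, 1) for _ in range(4)]
eps = 0.1
x_adv = [xi + rng.uniform(-eps, eps) for xi in x]

out, out_adv = linf_dist_net(x, layers), linf_dist_net(x_adv, layers)
# Empirical check of the 1-Lipschitz property (small tolerance for rounding).
lipschitz_ok = linf(out, out_adv) <= linf(x, x_adv) + 1e-12
print(lipschitz_ok, certified_radius(out))
```

The 1-Lipschitz property of a single neuron follows from the reverse triangle inequality, |‖x − w‖∞ − ‖x′ − w‖∞| ≤ ‖x − x′‖∞, which is why no weight constraint is needed in this sketch.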