One-Pixel Attack for Fooling Deep Neural Networks
Jiawei Su et al. (PDF, via Thomas Lahore):
Recent research has revealed that the output of Deep neural networks(DNN) is not continuous and very sensitive to tiny perturbation on the input vectors and accordingly several methods have been proposed for crafting effective perturbation against the networks. In this paper, we propose a novel method for optically calculating extremely small adversarial perturbation (few-pixels attack), based on differential evolution. It requires much less adversarial information and works with a broader classes of DNN models. The results show that 73.8% of the test images can be crafted to adversarial images with modification just on one pixel with 98.7% confidence on average. In addition, it is known that investigating the robustness problem of DNN can bring critical clues for understanding the geometrical features of the DNN decision map in high dimensional input space. The results of conducting few-pixels attack contribute quantitative measurements and analysis to the geometrical understanding from a different perspective compared to previous works.