Generating Adversarial Examples with Adversarial Networks

Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs to produce adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but how to produce them with high perceptual quality and more efficiently requires more research effort. In this paper, we propose AdvGAN to generate adversarial examples with generative adversarial networks (GANs), which can learn and approximate the distribution of original instances. For AdvGAN, once the generator is trained, it can generate perturbations efficiently for any instance, so as to potentially accelerate adversarial training as a defense. We apply AdvGAN in both semi-whitebox and black-box attack settings. In semi-whitebox attacks, there is no need to access the original target model after the generator is trained, in contrast to traditional white-box attacks. In black-box attacks, we dynamically train a distilled model for the black-box model and optimize the generator accordingly. Adversarial examples generated by AdvGAN on different target models have a high attack success rate under state-of-the-art defenses compared to other attacks. Our attack placed first with 92.76% accuracy on a public MNIST black-box attack challenge.
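The generator described above is typically trained with a weighted combination of three terms: an adversarial loss that pushes the target model toward the adversary-selected output, a GAN loss that keeps perturbed inputs close to the original data distribution, and a penalty that bounds the perturbation magnitude. A minimal sketch of that combined objective, assuming a weighted-sum form with a soft hinge on the perturbation's L2 norm (the weights `alpha`, `beta` and the helper names are illustrative, not the paper's exact notation):

```python
import math

def hinge_penalty(perturbation, bound):
    # Soft hinge on the L2 norm of the generated perturbation G(x):
    # zero when the norm is within the user-specified bound,
    # and grows linearly once the bound is exceeded.
    norm = math.sqrt(sum(p * p for p in perturbation))
    return max(0.0, norm - bound)

def generator_objective(adv_loss, gan_loss, perturbation, bound,
                        alpha=1.0, beta=1.0):
    # Illustrative combined generator objective:
    #   adv_loss  - encourages the target model to misclassify x + G(x)
    #   gan_loss  - encourages x + G(x) to look like a real instance
    #   hinge     - keeps the perturbation small in magnitude
    return adv_loss + alpha * gan_loss + beta * hinge_penalty(perturbation, bound)
```

Once trained against this objective, the generator produces a perturbation with a single forward pass per instance, which is the source of the efficiency claim: no per-input optimization is needed at attack time.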

#assurance, #neural-networks