Authors: Haojin Zhu, Minhui Xue, Benjamin Zi Hao Zhao, Dali Kaafar, Jiahao Yu
DOI:
Keywords: Similarity (geometry), Artificial intelligence, Covert, Invisibility, MNIST database, Backdoor, Trojan, Computer science, Embedding, Steganography
Abstract: Deep neural networks (DNNs) have been proven vulnerable to backdoor attacks, where hidden features (patterns) trained into a normal model, which are only activated by some specific input (called triggers), trick the model into producing unexpected behavior. In this paper, we create covert and scattered triggers for invisible backdoors that can fool both DNN models and human inspection. We apply our invisible backdoors through two state-of-the-art methods of embedding triggers for backdoor attacks. The first approach, based on Badnets, embeds the trigger into DNNs through steganography. The second, a trojan attack, uses two types of additional regularization terms to generate triggers with irregular shape and size. We use the Attack Success Rate and Functionality to measure the performance of our attacks, and introduce two novel definitions of invisibility for human perception; one is conceptualized by the Perceptual Adversarial Similarity Score (PASS) and the other by the Learned Perceptual Image Patch Similarity (LPIPS). We show that the proposed invisible backdoors can be fairly effective across various DNN models as well as four datasets (MNIST, CIFAR-10, CIFAR-100, and GTSRB), measuring their attack success rates for the adversary, functionality for normal users, and invisibility scores for administrators. We finally argue that the proposed invisible backdoor attacks can effectively thwart state-of-the-art trojan backdoor detection approaches, such as Neural Cleanse and TABOR.
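The steganography-based approach hides trigger information in the least significant bits of image pixels, so a poisoned input stays visually indistinguishable from the original. The sketch below is a minimal illustration of LSB embedding only; the payload string, bit depth, and function names are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def embed_trigger_lsb(image: np.ndarray, trigger: str, n_bits: int = 1) -> np.ndarray:
    """Hide a bit string in the low-order bits of a uint8 image.

    image   : uint8 array, e.g. shape (H, W, C)
    trigger : trigger payload (hypothetical; the paper's payload may differ)
    n_bits  : how many low-order bits per pixel to overwrite
    """
    # Serialize the trigger into a flat array of bits.
    payload = np.unpackbits(np.frombuffer(trigger.encode("utf-8"), dtype=np.uint8))
    flat = image.flatten()
    if len(payload) > len(flat) * n_bits:
        raise ValueError("trigger does not fit in the image")
    # Write payload bits, n_bits at a time, into each pixel's LSBs.
    for i in range(0, len(payload), n_bits):
        chunk = payload[i : i + n_bits]
        value = 0
        for b in chunk:
            value = (value << 1) | int(b)
        idx = i // n_bits
        mask = 0xFF ^ ((1 << len(chunk)) - 1)   # clear the low-order bits
        flat[idx] = (flat[idx] & mask) | value  # write the payload bits
    return flat.reshape(image.shape)

# With n_bits=1 the per-pixel perturbation is at most 1 intensity level.
clean = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
poisoned = embed_trigger_lsb(clean, trigger="TRIGGER")
assert int(np.abs(poisoned.astype(int) - clean.astype(int)).max()) <= 1
```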
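The trojan-style approach instead optimizes the trigger itself, adding regularization terms so the perturbation stays small and scattered rather than forming a solid patch. A minimal sketch follows, assuming a toy model, a logit-maximization objective, and a single L2 penalty; the paper's actual two regularization terms and target neurons are not reproduced here.

```python
import torch
import torch.nn as nn

# Toy stand-in network; the real attack targets neurons of a trained DNN.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
model.eval()

def generate_trigger(model: nn.Module, target_class: int = 0,
                     steps: int = 200, lam: float = 0.01) -> torch.Tensor:
    """Optimize an additive trigger that drives the target logit up while
    an L2 term keeps the perturbation small. Loss shape and the single
    regularizer are illustrative assumptions."""
    delta = torch.zeros(1, 3, 32, 32, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=0.05)
    for _ in range(steps):
        opt.zero_grad()
        logits = model(delta.clamp(-1, 1))
        # Maximize the target activation, penalize trigger magnitude.
        loss = -logits[0, target_class] + lam * delta.pow(2).sum()
        loss.backward()
        opt.step()
    return delta.detach().clamp(-1, 1)

trigger = generate_trigger(model)
print("trigger L2 norm:", trigger.norm().item())
```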
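For the invisibility metrics, LPIPS compares deep-feature representations of the clean and poisoned images; a lower score means a less perceptible trigger. A sketch using the open-source `lpips` package is below; the package choice and AlexNet backbone are assumptions, since the paper only names the metric.

```python
import lpips
import torch

# AlexNet-backed LPIPS model, as distributed by the `lpips` package.
loss_fn = lpips.LPIPS(net="alex")

def invisibility_score(clean: torch.Tensor, poisoned: torch.Tensor) -> float:
    """LPIPS distance between a clean and a poisoned image.

    Both tensors: shape (3, H, W), values in [0, 1].
    Lower distance = more invisible trigger.
    """
    # The lpips model expects NCHW inputs scaled to [-1, 1].
    to_lpips = lambda x: x.unsqueeze(0) * 2.0 - 1.0
    with torch.no_grad():
        d = loss_fn(to_lpips(clean), to_lpips(poisoned))
    return d.item()

clean = torch.rand(3, 32, 32)
poisoned = (clean + 0.01 * torch.randn_like(clean)).clamp(0, 1)
print(f"LPIPS: {invisibility_score(clean, poisoned):.4f}")
```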