When Explainability Meets Adversarial Learning: Detecting Adversarial Examples using SHAP Signatures

Authors: Asaf Shabtai, Ron Bitton, Gil Fidel

Abstract: State-of-the-art deep neural networks (DNNs) are highly effective in solving many complex real-world problems. However, these models are vulnerable to adversarial perturbation …

References (35)
Y. LeCun, L. Bottou, Y. Bengio, P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, vol. 86, pp. 2278-2324 (1998). DOI: 10.1109/5.726791
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Deep Residual Learning for Image Recognition. Computer Vision and Pattern Recognition, pp. 770-778 (2016). DOI: 10.1109/CVPR.2016.90
Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Pascal Frossard. DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks. Computer Vision and Pattern Recognition, pp. 2574-2582 (2016). DOI: 10.1109/CVPR.2016.282
Uri Shaham, Yutaro Yamada, Sahand Negahban. Understanding Adversarial Training: Increasing Local Stability of Neural Nets through Robust Optimization. Neurocomputing, vol. 307, pp. 195-204 (2018). DOI: 10.1016/J.NEUCOM.2018.04.027
Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. Knowledge Discovery and Data Mining, pp. 1135-1144 (2016). DOI: 10.1145/2939672.2939778
Gintare Karolina Dziugaite, Zoubin Ghahramani, Daniel M. Roy. A study of the effect of JPG compression on adversarial images. arXiv: Computer Vision and Pattern Recognition (2016)
Ian Goodfellow, Samy Bengio, Alexey Kurakin. Adversarial Machine Learning at Scale. arXiv: Computer Vision and Pattern Recognition (2016)
Nicolas Papernot, Patrick D. McDaniel, Kathrin Grosse, Praveen Manoharan, Michael Backes. On the (Statistical) Detection of Adversarial Examples. arXiv: Cryptography and Security (2017)
Saurabh Shintre, Andrew B. Gardner, Ryan R. Curtin, Reuben Feinman. Detecting Adversarial Samples from Artifacts. arXiv: Machine Learning (2017)
Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z. Berkay Celik, Ananthram Swami. Practical Black-Box Attacks against Machine Learning. Computer and Communications Security, pp. 506-519 (2017). DOI: 10.1145/3052973.3053009