Authors: Ninghao Liu, Xia Hu, Mengnan Du, Fan Yang, Ruixiang Tang
DOI:
Keywords:
Abstract: With the widespread use of deep neural networks (DNNs) in high-stakes applications, the security of DNN models has received extensive attention. In this paper, we investigate a specific security problem called the trojan attack, which aims to attack deployed DNN systems by relying on hidden trigger patterns inserted by malicious hackers. We propose a training-free attack approach that is different from previous work, in which trojaned behaviors are injected by retraining the model on a poisoned dataset. Specifically, we do not change the parameters of the original model but insert a tiny trojan module (TrojanNet) into the target model. The infected model with the trojan can misclassify inputs into a target label when the inputs are stamped with special triggers. The proposed TrojanNet has several nice properties: (1) it is activated by the trigger patterns and keeps silent for other signals, (2) it is model-agnostic and could be injected into most DNNs, dramatically expanding its attack scenarios, and (3) the training-free mechanism saves massive training effort compared with conventional trojan attack methods. The experimental results show that TrojanNet can inject the trojan into all labels simultaneously (all-label trojan attack) and achieves a 100% attack success rate without affecting model accuracy on the original tasks. Experimental analysis further demonstrates that state-of-the-art trojan detection algorithms fail to detect the TrojanNet attack. The code is available at https URL.