Similarity Based Binary Backdoor Detection via Attributed Control Flow Graph

Authors: Tianchen Zhang, Haixiang Wang, Huan Ying, Jiyuan Li

DOI: 10.1109/ITNEC48623.2020.9085069

Keywords: Embedding; Control flow graph; Binary number; Artificial neural network; Binary function; Backdoor; Computer science; Similarity (geometry); Firmware; Algorithm

Abstract: The problem of backdoor detection aims at detecting whether binary functions coming from different embedded end devices are similar to some known backdoors. Existing approaches use signature-based or approximate graph-matching algorithms. They all share a problem: they are hard to adapt to a new task. Also, graph-matching algorithms are inevitably slow and sometimes inaccurate. To address these issues, in this work we propose a novel neural network-based approach to compute the embedding, i.e., a numeric vector, based on the control flow graph of each binary function. Backdoor detection can then be done efficiently by measuring the distance between the embeddings of firmware functions and those of typical backdoor functions. We evaluate our method and achieve an F-1 score of 0.75, and it detects backdoors significantly faster than the existing method.
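
The distance-based matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the distance metric, so cosine distance is assumed here, and the function names, embedding values, and the threshold of 0.1 are all hypothetical. In practice the vectors would come from a neural network applied to each function's attributed control flow graph.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumption: a zero vector matches nothing
    return 1.0 - dot / (norm_u * norm_v)


def flag_backdoor_candidates(firmware_embeddings, backdoor_embeddings,
                             threshold=0.1):
    """Return (function, backdoor, distance) triples within the threshold.

    Both arguments map a name to an embedding vector. The threshold is a
    hypothetical value that would need tuning on labeled data.
    """
    hits = []
    for fn_name, fn_vec in firmware_embeddings.items():
        for bd_name, bd_vec in backdoor_embeddings.items():
            dist = cosine_distance(fn_vec, bd_vec)
            if dist <= threshold:
                hits.append((fn_name, bd_name, dist))
    # Closest matches first
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical, hand-written embeddings used only for illustration
firmware = {
    "sub_401000": [0.9, 0.1, 0.0],
    "sub_401200": [0.0, 1.0, 0.2],
}
known_backdoors = {"hardcoded_auth_bypass": [1.0, 0.1, 0.0]}

candidates = flag_backdoor_candidates(firmware, known_backdoors)
for fn_name, bd_name, dist in candidates:
    print(f"{fn_name} resembles {bd_name} (distance {dist:.4f})")
```

Because each function is reduced to a fixed-length vector once, comparing a firmware image against a library of known backdoors costs one vector operation per pair, rather than one graph-matching run per pair.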
