Polymorphic Malware Detection Using Hierarchical Hidden Markov Model

Authors: Fahad Bin Muhaya, Muhammad Khurram Khan, Yang Xiang

DOI: 10.1109/DASC.2011.47

Keywords: Hierarchical hidden Markov model; Machine learning; Network security; Hidden Markov model; Control flow graph; Computer science; Finite-state machine; Tree (data structure); Malware; Code generation; Data mining; Artificial intelligence

Abstract: Binary signatures have been widely used to detect malicious software on the current Internet. However, this approach is unable to achieve accurate identification of polymorphic malware variants, which can be easily generated by malware authors using code generation engines. Code generation engines randomly produce varying code sequences that perform the same desired functions. Previous research used flow graphs and signature trees to identify polymorphic malware families. The key difficulty of previous research is generating precisely defined state machine models from polymorphic variants. This paper proposes a novel approach, using a Hierarchical Hidden Markov Model (HHMM), to provide accurate inductive inference of the malware family. This model can capture the features of the self-similar and hierarchical structure of polymorphic malware family signature sequences. To demonstrate the effectiveness and efficiency of this approach, we evaluate it with real malware samples. Using more than 15,000 real malware samples, we find that our approach achieves high true positives, low false positives, and low computational cost.
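
As a rough illustration of the kind of inference the abstract describes, the sketch below scores a symbol sequence against a small hidden Markov model and accepts it as a family member when its per-symbol log-likelihood is high enough. It is a deliberate simplification: it uses a flat HMM rather than the hierarchical model the paper proposes, and the two-state model, its probabilities, the binary alphabet, the threshold, and the names `log_forward` and `looks_like_family` are all assumptions made up for this example, not values or APIs taken from the paper.

```python
import math

# Toy, hand-picked parameters (assumptions, not from the paper):
# two hidden states, two observable symbols (0 and 1).
START = [0.5, 0.5]                  # initial state probabilities
TRANS = [[0.9, 0.1], [0.1, 0.9]]    # TRANS[p][s] = P(next state s | state p)
EMIT = [[0.9, 0.1], [0.1, 0.9]]     # EMIT[s][o]  = P(symbol o | state s)


def log_forward(obs, start, trans, emit):
    """Return log P(obs | model) for a flat discrete HMM.

    Uses the forward algorithm with per-step rescaling so that long
    sequences do not underflow.
    """
    if not obs:
        raise ValueError("obs must contain at least one symbol")
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")    # sequence is impossible under the model
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob


def looks_like_family(obs, threshold=-0.7):
    """Hypothetical decision rule: per-symbol log-likelihood above a threshold."""
    return log_forward(obs, START, TRANS, EMIT) / len(obs) > threshold


if __name__ == "__main__":
    for seq in ([0, 0, 0, 0], [0, 1, 0, 1]):
        print(seq, log_forward(seq, START, TRANS, EMIT), looks_like_family(seq))
```

In a real system the model parameters would be learned from known samples of a family (for example with Baum-Welch training) and the threshold tuned on held-out data; a hierarchical model would additionally let each state expand into its own sub-model.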

References (10)
Thomas Raffetseder, Christopher Kruegel, Engin Kirda, Detecting System Emulators. Lecture Notes in Computer Science, pp. 1-18 (2007). DOI: 10.1007/978-3-540-75496-1_1
Shai Fine, Yoram Singer, Naftali Tishby, The Hierarchical Hidden Markov Model: Analysis and Applications. Machine Learning, vol. 32, pp. 41-62 (1998). DOI: 10.1023/A:1007469218079
Silvio Cesare, Yang Xiang, A Fast Flowgraph Based Classification System for Packed and Polymorphic Malware on the Endhost. Advanced Information Networking and Applications, pp. 721-728 (2010). DOI: 10.1109/AINA.2010.121
A. Viterbi, Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, vol. 13, pp. 260-269 (1967). DOI: 10.1109/TIT.1967.1054010
Leonard E. Baum, Ted Petrie, Statistical Inference for Probabilistic Functions of Finite State Markov Chains. Annals of Mathematical Statistics, vol. 37, pp. 1554-1563 (1966). DOI: 10.1214/AOMS/1177699147
J. Newsome, B. Karp, D. Song, Polygraph: automatically generating signatures for polymorphic worms. IEEE Symposium on Security and Privacy, pp. 226-241 (2005). DOI: 10.1109/SP.2005.15
Md. Enamul. Karim, Andrew Walenstein, Arun Lakhotia, Laxmi Parida, Malware Phylogeny Generation using Permutations of Code. Journal in Computer Virology, vol. 1, pp. 13-23 (2005). DOI: 10.1007/S11416-005-0002-9
Zhichun Li, Manan Sanghi, Yan Chen, Ming-Yang Kao, B. Chavez, Hamsa: fast signature generation for zero-day polymorphic worms with provable attack resilience. IEEE Symposium on Security and Privacy, pp. 32-47 (2006). DOI: 10.1109/SP.2006.18
Min Gyung Kang, Pongsin Poosankam, Heng Yin, Renovo. Proceedings of the 2007 ACM workshop on Recurring malcode - WORM '07, pp. 46-53 (2007). DOI: 10.1145/1314389.1314399
Yong Tang, Bin Xiao, Xicheng Lu, Signature Tree Generation for Polymorphic Worms. IEEE Transactions on Computers, vol. 60, pp. 565-579 (2011). DOI: 10.1109/TC.2010.130