Authors: Fahad Bin Muhaya, Muhammad Khurram Khan, Yang Xiang
DOI: 10.1109/DASC.2011.47
Keywords: Hierarchical hidden Markov model, Machine learning, Network security, Hidden Markov model, Control flow graph, Computer science, Finite-state machine, Tree (data structure), Malware, Code generation, Data mining, Artificial intelligence
Abstract: Binary signatures have been widely used to detect malicious software on the current Internet. However, this approach is unable to achieve accurate identification of polymorphic malware variants, which can be easily generated by malware authors using code generation engines. Code generation engines randomly produce varying code sequences that perform the same desired functions. Previous research used flow graphs and signature trees to identify malware families. The key difficulty in previous research is obtaining precisely defined state machine models from the variants. This paper proposes a novel approach, using a Hierarchical Hidden Markov Model (HHMM), to provide inductive inference of the malware family. The model can capture the features of the self-similar and hierarchical structure of malware family sequences. To demonstrate the effectiveness and efficiency of this approach, we evaluate it with real samples. Using more than 15,000 malware samples, we find that our approach achieves high true positives, low false positives, and low computational cost.