An empirical study to estimate the stability of random forest classifier on the hybrid features recommended by filter based feature selection technique

Authors: S. L. Shiva Darshan, C. D. Jaidhar

DOI: 10.1007/S13042-019-00978-7

Keywords: Classifier (UML), Random forest, Source code, Computational intelligence, Data mining, Portable Executable, Decision tree, Computer science, Malware, Feature selection

Abstract: The emergence of advanced malware is a serious threat to information security. A prominent technique that identifies sophisticated malware should consider the runtime behaviour of the source file to detect malicious intent. Although behaviour-based malware detection is a substantial improvement over the traditional signature-based detection technique, current malware employs code obfuscation techniques to elude detection. This paper presents the Hybrid Features-based Malware Detection System (HFMDS), which integrates static and dynamic features of portable executable (PE) files to discern malware. The HFMDS is trained with the features advised by a filter-based feature selection technique (FST). The detection ability of the proposed HFMDS has been evaluated with the random forest (RF) classifier by considering two different datasets that consist of real-world Windows malware samples. In-depth analysis is carried out to determine the optimal number of decision trees (DTs) required by the RF classifier to achieve consistent accuracy. Besides, the performance of four popular FSTs is also analyzed to determine which FST recommends the best features. From the experimental analysis, we can infer that increasing the number of DTs beyond 160 within the RF classifier does not make a significant difference in attaining better detection accuracy.
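
The filter-based selection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes information gain as the scoring criterion, since the abstract does not name the four FSTs that were compared. The feature names and the eight toy samples are hypothetical and are not taken from the paper's datasets. Each feature is scored independently of any classifier, which is what distinguishes a filter method from a wrapper method. The top-ranked features would then be passed to the classifier.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(column, labels):
    """Reduction in label entropy after splitting on one discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(column):
        subset = [y for x, y in zip(column, labels) if x == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


def rank_features(rows, labels, names):
    """Score each feature without consulting a classifier, best first."""
    scores = {
        name: information_gain([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical hybrid feature set: two static and two dynamic binary features.
FEATURE_NAMES = [
    "static_packed_section",
    "static_has_debug_info",
    "dynamic_calls_WriteProcessMemory",
    "dynamic_opens_window",
]

# Hypothetical toy samples (one tuple per PE file), not real malware data.
ROWS = [
    (1, 0, 1, 0),
    (1, 1, 1, 1),
    (1, 0, 1, 0),
    (0, 0, 1, 1),
    (0, 1, 0, 1),
    (0, 0, 0, 0),
    (1, 1, 0, 1),
    (0, 0, 0, 0),
]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = malware, 0 = benign

ranking = rank_features(ROWS, LABELS, FEATURE_NAMES)
top_k = [name for name, _ in ranking[:2]]  # features handed to the classifier

if __name__ == "__main__":
    for name, score in ranking:
        print(f"{name}: {score:.4f}")
```

In this toy data the dynamic feature that separates the two classes perfectly receives the highest score, and the feature that is evenly spread across both classes receives a score of zero.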
