Improving malware detection by applying multi-inducer ensemble

Authors: Eitan Menahem, Asaf Shabtai, Lior Rokach, Yuval Elovici

DOI: 10.1016/J.CSDA.2008.10.015

Keywords: Decision theory, Task (computing), Data mining, Machine learning, Software, Execution time, Decision tree, Naive Bayes classifier, Malware, Artificial intelligence, Exploit, Computer science

Abstract: Detection of malicious software (malware) using machine learning methods has been explored extensively to enable fast detection of newly released malware. The performance of these classifiers depends on the induction algorithms being used. In order to benefit from multiple different classifiers and exploit their strengths, we suggest an ensemble method that combines the results of the individual classifiers into one final result to achieve overall higher accuracy. In this paper we evaluate several combining methods using five base inducers (C4.5 Decision Tree, Naive Bayes, KNN, VFI and OneR) on malware datasets. The main goal is to find the best combining method for the task of detecting malicious files in terms of accuracy, AUC and execution time.
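To make the idea of combining individual classifier results concrete, here is a minimal Python sketch of two common combination schemes, majority voting and distribution summation. The abstract does not say which combining methods the paper evaluates, so these two are assumptions chosen for illustration. The function names, labels, and probability values are also hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the class label chosen by the most base classifiers.

    Ties are broken in favor of the label that appears first.
    """
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]


def distribution_summation(distributions):
    """Sum the per-class probabilities of all base classifiers and
    return the label with the highest total."""
    totals = {}
    for dist in distributions:
        for label, prob in dist.items():
            totals[label] = totals.get(label, 0.0) + prob
    return max(totals, key=totals.get)


# Hypothetical crisp predictions of the five base inducers for one file.
votes = {
    "C4.5": "malicious",
    "NaiveBayes": "benign",
    "KNN": "malicious",
    "VFI": "benign",
    "OneR": "malicious",
}

# Hypothetical class-probability estimates from the same five inducers,
# listed in the same order as above.
probs = [
    {"malicious": 0.55, "benign": 0.45},
    {"malicious": 0.10, "benign": 0.90},
    {"malicious": 0.60, "benign": 0.40},
    {"malicious": 0.20, "benign": 0.80},
    {"malicious": 0.55, "benign": 0.45},
]

print(majority_vote(list(votes.values())))  # malicious (3 of 5 votes)
print(distribution_summation(probs))        # benign (total 3.0 vs 2.0)
```

In this made-up example the two schemes disagree: three inducers vote "malicious" with low confidence, while two vote "benign" with high confidence. Differences of this kind are one reason a comparison across several combining methods can be informative.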
