Comparison of feature selection and classification algorithms in identifying malicious executables

Authors: D. Michael Cai, Maya Gokhale, James Theiler

DOI: 10.1016/J.CSDA.2006.09.005

Keywords: Naive Bayes classifier, Statistical classification, Byte, Computer science, Classifier (UML), Pattern recognition, Email filtering, Machine learning, Overfitting, Artificial intelligence, Feature selection, Support vector machine

Abstract: Malicious executables, often spread as email attachments, impose serious security threats to computer systems and associated networks. We investigated the use of byte sequence frequencies as a way to automatically distinguish malicious from benign executables without actually executing them. In a series of experiments, we compared classification accuracies over seven feature selection methods, four classification algorithms, and variable byte sequence lengths. We found that single-byte patterns provided surprisingly reliable features to separate malicious executables from benign. Between classifiers and feature selection, the overall performance of the models depended more on the choice of classifier than on the method of feature selection. Support vector machine (SVM) classifiers were found to be superior in terms of prediction accuracy, training time, and aversion to overfitting.
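As a rough illustration of the byte sequence frequency features the abstract describes, the sketch below computes relative frequencies of byte n-grams from a file's raw contents without executing it. The function name `byte_ngram_frequencies` and the sliding-window normalization are assumptions made for this example; the record does not specify how the authors computed or normalized their features.

```python
from collections import Counter


def byte_ngram_frequencies(data: bytes, n: int = 1) -> dict:
    """Return the relative frequency of each byte n-gram in `data`.

    Hypothetical helper for illustration only. With n=1 this yields the
    single-byte patterns mentioned in the abstract.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    total = len(data) - n + 1  # number of overlapping windows of length n
    if total <= 0:
        return {}
    counts = Counter(data[i:i + n] for i in range(total))
    return {gram: count / total for gram, count in counts.items()}


# Toy input standing in for the bytes of an executable file.
sample = b"\x00\x00\x01\x02"
single_byte = byte_ngram_frequencies(sample, n=1)
two_byte = byte_ngram_frequencies(sample, n=2)
```

In practice the bytes would be read from disk with `open(path, "rb").read()`, and the resulting frequency vectors would be passed to a feature selection step and a classifier such as an SVM.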

References (30)
Gerald Tesauro, William Arnold. Automatically Generated Win32 Heuristic Virus Detection. (2000)
Yiming Yang, Seán Slattery, Rayid Ghani. A Study of Approaches to Hypertext Categorization. Intelligent Information Systems, vol. 18, pp. 219-241 (2002). DOI: 10.1023/A:1013685612819
John C. Platt. Fast training of support vector machines using sequential minimal optimization. Advances in Kernel Methods, pp. 185-208 (1999)
Kamal Nigam, Andrew McCallum. A comparison of event models for naive Bayes text classification. National Conference on Artificial Intelligence, pp. 41-48 (1998)
Thorsten Joachims. Making large scale SVM learning practical. Technical Reports (1999). DOI: 10.17877/DE290R-14262
B. Schölkopf, C. Burges, A. Smola. Advances in kernel methods: support vector learning. International Conference on Neural Information Processing (1999). DOI: 10.5555/299094
Erez Zadok, Eleazar Eskin, Salvatore J. Stolfo, Manasi Bhattacharyya, Matthew G. Schultz. MEF: Malicious Email Filter - A UNIX Mail Filter That Detects Malicious Windows Executables. USENIX Annual Technical Conference, pp. 245-252 (2001). DOI: 10.7916/D8W38329
Wenke Lee, S.J. Stolfo, K.W. Mok. A data mining framework for building intrusion detection models. IEEE Symposium on Security and Privacy, pp. 120-132 (1999). DOI: 10.1109/SECPRI.1999.766909
Fred Cohen. A cryptographic checksum for integrity protection. Computers & Security, vol. 6, pp. 505-510 (1987). DOI: 10.1016/0167-4048(87)90031-9
Robert Moskovitch, Yuval Elovici, Lior Rokach. Detection of unknown computer worms based on behavioral classification of the host. Computational Statistics & Data Analysis, vol. 52, pp. 4544-4566 (2008). DOI: 10.1016/J.CSDA.2008.01.028