Using file relationships in malware classification

Authors: Nikos Karampatziakis, Jack W. Stokes, Anil Thomas, Mady Marinescu

DOI: 10.1007/978-3-642-37300-8_1

Abstract: Typical malware classification methods analyze unknown files in isolation. However, this ignores valuable relationships between files, such as containment in a zip archive, dropping, or downloading. We present a new malware classification system based on a graph induced by file relationships and, as a proof of concept, we analyze containment relationships, for which we have much available data. However, our methodology is general, relying only on an initial estimate for some of the files in our data and on propagating information along the edges of the graph. It can thus be applied to other types of file relationships. We show that since malicious files are often included in multiple malware containers, the system's detection accuracy can be significantly improved, particularly at low false positive rates, which are the main operating points for automated malware classifiers. For example, at a false positive rate of 0.2%, the false negative rate decreases from 42.1% to 15.2%. Finally, the new system is highly scalable; our basic implementation can learn good classifiers on a large, bipartite graph including over 719 thousand containers and 3.4 million total files in 16 minutes.
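
To make the idea of propagating initial estimates along the edges of a bipartite container-file graph more concrete, here is a minimal Python sketch. It is not the authors' algorithm: the averaging update rule, the function name `propagate_scores`, and the parameters `alpha`, `default`, and `rounds` are all assumptions chosen for illustration. Each container takes the mean score of the files it holds, and each file then blends its own initial estimate with the mean score of the containers that include it.

```python
from collections import defaultdict


def propagate_scores(containment, initial_scores, alpha=0.5, default=0.5, rounds=10):
    """Illustrative score propagation on a bipartite container-file graph.

    containment:    dict mapping container id -> list of contained file ids
    initial_scores: dict mapping file id -> initial maliciousness estimate in [0, 1]
                    (only some files need an estimate)
    alpha:          weight kept on a file's own initial estimate (assumed value)
    default:        estimate used for files with no initial score (assumed value)
    """
    # Ignore empty containers so the averages below are always defined.
    containment = {c: files for c, files in containment.items() if files}

    # Reverse index: which containers include each file.
    containers_of = defaultdict(list)
    for container, files in containment.items():
        for f in files:
            containers_of[f].append(container)

    prior = {f: initial_scores.get(f, default) for f in containers_of}
    file_score = dict(prior)
    container_score = {}

    for _ in range(rounds):
        # Files -> containers: a container's score is the mean of its files.
        for container, files in containment.items():
            container_score[container] = sum(file_score[f] for f in files) / len(files)
        # Containers -> files: blend the file's prior with its containers' mean.
        for f, containers in containers_of.items():
            neighbor_mean = sum(container_score[c] for c in containers) / len(containers)
            file_score[f] = alpha * prior[f] + (1 - alpha) * neighbor_mean

    return file_score, container_score
```

In this sketch, a file with no initial estimate that shares an archive with a file scored as likely malicious ends up with a score above the default, which mirrors the abstract's observation that containment relationships carry useful signal. A real system would replace the fixed averaging rule with a learned model.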
