Combining file content and file relations for cloud based malware detection

Authors: Yanfang Ye, Tao Li, Shenghuo Zhu, Weiwei Zhuang, Egemen Tas

DOI: 10.1145/2020408.2020448

Abstract: Because of the damage it does to Internet security, malware (such as viruses, worms, trojans, spyware, backdoors, and rootkits) and its detection have held the attention not only of the anti-malware industry but also of researchers for decades. Resting on the analysis of file contents extracted from file samples, such as Application Programming Interface (API) calls, instruction sequences, and binary strings, data mining methods such as Naive Bayes and Support Vector Machines have been used for malware detection. However, besides file contents, relations among file samples, such as the fact that a "Downloader" is always associated with many Trojans, can provide invaluable information about the properties of file samples. In this paper, we study how file relations can be used to improve malware detection results and develop a file verdict system (named "Valkyrie") building on a semi-parametric classifier model to combine file content and file relations together for malware detection. To the best of our knowledge, this is the first work using both file content and file relations for malware detection. A comprehensive experimental study on a large collection of PE files obtained from the clients of anti-malware products of Comodo Security Solutions Incorporation is performed to compare various malware detection approaches. Promising experimental results demonstrate that the accuracy and efficiency of our Valkyrie system outperform other popular anti-malware software tools such as Kaspersky AntiVirus and McAfee VirusScan, as well as other alternative data mining based detection systems.
