FindMal: A file-to-file social network based malware detection framework

Authors: Ming Ni, Tao Li, Qianmu Li, Hong Zhang, Yanfang Ye

DOI: 10.1016/J.KNOSYS.2016.09.004

Keywords: Artificial intelligence; Construct (python library); Node (networking); Malware; Internet security; Data mining; Graph (abstract data type); Machine learning; Computer science; Active learning (machine learning); Sample (statistics)

Abstract: The rapid development of malicious software programs has posed severe threats to computer and Internet security. This motivates anti-malware vendors and researchers to develop novel methods capable of protecting users against new threats. Existing malware detectors mostly treat file samples separately, using supervised learning algorithms. However, ignoring the relationships among file samples limits the capability of malware detectors. In this paper, based on a file-to-file social network, we present a malware detection framework, FindMal (File-to-File Social Network based Malware Detection Framework), which includes graph-based feature extraction, a Label Propagation algorithm, and an active learning strategy. Nearest neighbors are first chosen as adjacent nodes for each file node to construct a kNN file relation graph. Three graph-based features are proposed for representative sample labeling. Then the Label Propagation algorithm, which propagates label information from labeled files to unlabeled files, is applied to learn the probability that an unknown file is classified as malicious or benign. A batch-mode active learning method is employed to reduce the labeling cost and improve the performance of Label Propagation. Comprehensive experiments on a real, large-scale dataset obtained from an anti-malware company are performed. The results demonstrate that FindMal outperforms other existing detection models in classifying file samples.
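
The graph construction and label propagation steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `knn_graph` and `label_propagation`, the Euclidean distance, the Gaussian edge weight with unit bandwidth, the fixed iteration count, and the toy 2-D feature vectors are all assumptions made for the example.

```python
import math


def knn_graph(points, k):
    """Build a symmetric weighted kNN graph as {node: {neighbor: weight}}.

    Assumption: files are represented as numeric feature vectors and
    similarity is a Gaussian of the Euclidean distance (bandwidth 1).
    """
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # Sort all other nodes by distance to node i and keep the k closest.
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            w = math.exp(-d * d)
            graph[i][j] = w
            graph[j][i] = w  # keep the graph undirected
    return graph


def label_propagation(graph, labels, iterations=100):
    """Propagate labels over the graph.

    labels maps labeled nodes to 1.0 (malicious) or 0.0 (benign).
    Returns, for every node, a score in [0, 1] read as the estimated
    probability of being malicious.
    """
    scores = {i: labels.get(i, 0.5) for i in graph}
    for _ in range(iterations):
        updated = {}
        for i, neighbors in graph.items():
            if i in labels:
                updated[i] = labels[i]  # labeled nodes stay clamped
                continue
            total = sum(neighbors.values())
            if total == 0:
                updated[i] = scores[i]  # isolated node: keep current score
            else:
                # Weighted average of the neighbors' current scores.
                updated[i] = sum(w * scores[j] for j, w in neighbors.items()) / total
        scores = updated
    return scores


# Toy data (hypothetical): two well-separated clusters of "files".
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
known = {0: 1.0, 3: 0.0}  # node 0 labeled malicious, node 3 labeled benign

graph = knn_graph(points, k=2)
scores = label_propagation(graph, known)
for node in sorted(scores):
    print(node, round(scores[node], 3))
```

In this toy run the unlabeled nodes in each cluster take on the label of the single labeled node they are connected to, which is the behavior the abstract attributes to Label Propagation on the kNN file relation graph.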

References (36)
Andrei Venzhega, Polina Zhinalieva, Nikolay Suboch. Graph-based malware distributors detection. The Web Conference, pp. 1141-1144 (2013). DOI: 10.1145/2487788.2488136
Wentao Zhao, Jun Long, En Zhu, Yun Liu. A Scalable Algorithm for Graph-Based Active Learning. Frontiers in Algorithmics, pp. 311-322 (2008). DOI: 10.1007/978-3-540-69311-6_32
Xiaojin Zhu, Zoubin Ghahramani. Learning from labeled and unlabeled data with label propagation. Center for Automated Learning and Discovery, CMU: Carnegie Mellon University, USA (2002)
Nicolò Cesa-Bianchi, Giovanni Zappella, Fabio Vitale, Claudio Gentile. Active Learning on Trees and Graphs. Conference on Learning Theory, pp. 320-332 (2010)
Hieu T. Nguyen, Arnold Smeulders. Active learning using pre-clustering. International Conference on Machine Learning, pp. 79- (2004). DOI: 10.1145/1015330.1015349
Yanfang Ye, Tao Li, Shenghuo Zhu, Weiwei Zhuang, Egemen Tas, Umesh Gupta, Melih Abdulhayoglu. Combining file content and file relations for cloud based malware detection. Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '11), pp. 222-230 (2011). DOI: 10.1145/2020408.2020448
Nir Nissim, Robert Moskovitch, Lior Rokach, Yuval Elovici. Detecting unknown computer worm activity via support vector machines and active learning. Pattern Analysis and Applications, vol. 15, pp. 459-475 (2012). DOI: 10.1007/S10044-012-0296-4
D. Moore, V. Paxson, S. Savage, C. Shannon, S. Staniford, N. Weaver. Inside the Slammer worm. IEEE Symposium on Security and Privacy, vol. 1, pp. 33-39 (2003). DOI: 10.1109/MSECP.2003.1219056
Eric Filiol. Malware Pattern Scanning Schemes Secure Against Black-box Analysis. Journal in Computer Virology, vol. 2, pp. 35-50 (2006). DOI: 10.1007/S11416-006-0009-X
Nir Nissim, Robert Moskovitch, Lior Rokach, Yuval Elovici. Novel active learning methods for enhanced PC malware detection in Windows OS. Expert Systems With Applications, vol. 41, pp. 5843-5857 (2014). DOI: 10.1016/J.ESWA.2014.02.053