WEAC: Word embeddings for anomaly classification from event logs

Authors: Amit Pande, Vishal Ahuja

DOI: 10.1109/BIGDATA.2017.8258034

Abstract: Dramatic progress has been made in recent years in the usage of semantic word embeddings for solving word analogy tasks. Word embeddings, or vector representations of words, have been key to many advances in natural language processing. This paper presents a novel application of word embeddings, Word-Embeddings for Anomaly Classification (WEAC), where we detect whether an event log entry is an anomalous one or not. Additionally, WEAC helps us classify the anomaly by identifying the anomalous feature(s) in the event log. For example, an unusual network activity such as a store transaction server logging into dropbox.com would be automatically flagged because of wrong feature associations in the corresponding log entries. WEAC works with two training models: Skip-Gram (SG) and Continuous Bag of Words (CBOW). Negative sampling is used to boost training. The initial results on the Wikipedia text8 dataset, as well as an investigation of enterprise HTTP logs, are promising. The model achieved an average detection rate of 65–100% and a classification accuracy of 85–100%. WEAC was superior to state-of-the-art techniques.
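The idea in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: it treats each log entry as a short "sentence" of `feature=value` tokens, trains a tiny Skip-Gram model with negative sampling in pure Python, and scores how well each feature of a new entry agrees with the entry's other features. The host names, domains, hyperparameters, and the uniform negative sampler are all assumptions made for illustration; CBOW is not shown.

```python
import math
import random

def entry_to_tokens(entry):
    """Turn a log entry (dict of feature -> value) into tokens like 'host=pos-01'."""
    return [f"{key}={value}" for key, value in sorted(entry.items())]

def skipgram_pairs(tokens):
    """All ordered (center, context) pairs; the window is the whole entry."""
    return [(c, o) for i, c in enumerate(tokens) for j, o in enumerate(tokens) if i != j]

def sigmoid(x):
    x = max(-30.0, min(30.0, x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train_sgns(entries, dim=8, epochs=200, lr=0.05, negatives=2, seed=0):
    """Skip-Gram with negative sampling; negatives are drawn uniformly (a simplification)."""
    rng = random.Random(seed)
    sentences = [entry_to_tokens(e) for e in entries]
    vocab = sorted({t for s in sentences for t in s})
    w_in = {t: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for t in vocab}
    w_out = {t: [0.0] * dim for t in vocab}
    for _ in range(epochs):
        for sentence in sentences:
            candidates = [t for t in vocab if t not in sentence]
            for center, context in skipgram_pairs(sentence):
                targets = [(context, 1.0)]  # one observed pair
                if candidates:              # plus a few random non-observed pairs
                    targets += [(rng.choice(candidates), 0.0) for _ in range(negatives)]
                v = w_in[center]
                grad_v = [0.0] * dim
                for token, label in targets:
                    u = w_out[token]
                    g = lr * (label - sigmoid(dot(v, u)))
                    for d in range(dim):
                        grad_v[d] += g * u[d]  # uses u before it is updated
                        u[d] += g * v[d]
                for d in range(dim):
                    v[d] += grad_v[d]
    return w_in, w_out

def feature_scores(entry, w_in, w_out):
    """Mean predicted association of each feature with the entry's other features."""
    tokens = entry_to_tokens(entry)
    scores = {}
    for center in tokens:
        probs = [
            sigmoid(dot(w_in[center], w_out[other]))
            if center in w_in and other in w_out else 0.0  # unseen token: no support
            for other in tokens if other != center
        ]
        scores[center] = sum(probs) / max(len(probs), 1)
    return scores

# Hypothetical training log: which hosts normally talk to which domains.
training_log = [
    {"host": "pos-01", "domain": "payments.example"},
    {"host": "pos-02", "domain": "payments.example"},
    {"host": "laptop-07", "domain": "dropbox.com"},
    {"host": "laptop-07", "domain": "news.example"},
]
w_in, w_out = train_sgns(training_log)

normal = feature_scores({"host": "pos-01", "domain": "payments.example"}, w_in, w_out)
suspicious = feature_scores({"host": "pos-01", "domain": "dropbox.com"}, w_in, w_out)
print("normal:", normal)
print("suspicious:", suspicious)
```

In this sketch, a low score for a feature means the trained embeddings give little support to its co-occurrence with the rest of the entry, which is one plausible way to read the "wrong feature associations" mentioned in the abstract. A threshold on the lowest score would be needed to turn this into a detector, and choosing it is outside the scope of the example.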
