E-mail authorship attribution using customized associative classification

Authors: Michael R. Schmid, Farkhund Iqbal, Benjamin C.M. Fung

DOI: 10.1016/J.DIIN.2015.05.012

Abstract: E-mail communication is often abused for conducting social engineering attacks, including spamming, phishing, identity theft, and distributing malware. This is largely attributed to the problem of anonymity inherent in the standard electronic mail protocol. In the literature, authorship attribution is studied as a text categorization problem in which the writing styles of individuals are modeled based on their previously written sample documents. The developed model is then employed to identify the most plausible writer of a text. Unfortunately, existing studies focus solely on improving predictive accuracy and not on the value of the evidence collected. In this study, we propose a customized associative classification technique, a popular data mining method, to address this problem. Our approach models the unique style features of a person, measures the associativity of these features, and produces an intuitive classifier. The results obtained by experiments on a real dataset reveal that the presented method is very effective.
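To make the idea of associative classification concrete, the following is a minimal sketch of generic class association rule mining in Python. It is not the customized method proposed in the paper: the feature names, the toy e-mails, the thresholds, and the helper functions `mine_class_rules` and `predict` are all hypothetical and chosen only for illustration. Each e-mail is reduced to a set of discrete style features, rules of the form "feature set implies author" are kept when they pass minimum support and confidence, and the highest-ranked matching rule decides the author.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each e-mail is a set of discretized style
# features plus its known author. None of this comes from the paper.
TOY_EMAILS = [
    ({"short_sentences", "no_greeting"}, "alice"),
    ({"short_sentences", "no_greeting", "many_commas"}, "alice"),
    ({"short_sentences", "formal_greeting"}, "alice"),
    ({"long_sentences", "formal_greeting"}, "bob"),
    ({"long_sentences", "formal_greeting", "many_commas"}, "bob"),
    ({"long_sentences", "no_greeting"}, "bob"),
]


def mine_class_rules(records, min_support=0.3, min_confidence=0.7, max_len=2):
    """Return rules (features, author, support, confidence), best first."""
    n = len(records)
    itemset_counts = Counter()  # how often a feature set occurs at all
    rule_counts = Counter()     # how often it occurs with a given author
    for items, author in records:
        for k in range(1, max_len + 1):
            for combo in combinations(sorted(items), k):
                itemset_counts[combo] += 1
                rule_counts[(combo, author)] += 1

    rules = []
    for (combo, author), hits in rule_counts.items():
        support = hits / n                        # share of all e-mails
        confidence = hits / itemset_counts[combo]  # share of matching e-mails
        if support >= min_support and confidence >= min_confidence:
            rules.append((combo, author, support, confidence))

    # Rank by confidence, then support, then prefer shorter rules.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0]), r[0]))
    return rules


def predict(rules, items):
    """Return the author of the first rule whose features all appear."""
    for combo, author, _support, _confidence in rules:
        if set(combo) <= items:
            return author
    return None  # no rule fires; the sketch makes no guess


rules = mine_class_rules(TOY_EMAILS)
for combo, author, support, confidence in rules:
    print(f"{combo} -> {author} (support={support:.2f}, confidence={confidence:.2f})")
```

Because every retained rule names the features that triggered it, a prediction can be traced back to readable evidence, which is the sense in which the abstract calls such a classifier intuitive.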
