Authorship analysis in cybercrime investigation

Authors: Rong Zheng, Yi Qin, Zan Huang, Hsinchun Chen

DOI: 10.1007/3-540-44853-5_5

Keywords:

Abstract: Criminals have been using the Internet to distribute a wide range of illegal materials globally in an anonymous manner, making criminal identity tracing difficult in the cybercrime investigation process. In this study we propose to adopt the authorship analysis framework to automatically trace the identities of cyber criminals through the messages they post on the Internet. Under this framework, three types of message features, including style markers, structural features, and content-specific features, are extracted, and inductive learning algorithms are used to build feature-based models to identify the authors of messages. To evaluate the effectiveness of this framework, we conducted an experimental study on data sets of English and Chinese email and online newsgroup messages. We experimented with all the feature types and learning algorithms. The results indicate that the proposed approach can discover the real authors of both English and Chinese messages with relatively high accuracies.
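The pipeline described in the abstract (extract style, structural, and content-specific features from each message, then learn a per-author model) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the function-word list, the content keywords, the greeting pattern, and the nearest-centroid classifier are placeholders, not the feature set or the inductive learning algorithms used in the paper.

```python
import math
import re
from collections import Counter

# Hypothetical feature vocabularies, chosen only for this illustration.
FUNCTION_WORDS = ("the", "of", "and", "to", "in")
CONTENT_KEYWORDS = ("sale", "price")


def extract_features(message: str) -> list[float]:
    """Turn one message into a numeric vector of three feature groups."""
    words = re.findall(r"[A-Za-z']+", message.lower())
    n_words = max(len(words), 1)  # avoid division by zero on empty input
    counts = Counter(words)
    lines = message.splitlines() or [""]

    # Style markers: word-length and vocabulary statistics, function-word rates.
    style = [
        sum(len(w) for w in words) / n_words,
        len(set(words)) / n_words,
        *[counts[w] / n_words for w in FUNCTION_WORDS],
    ]
    # Structural features: layout of the message.
    structural = [
        float(len(lines)),
        float(bool(re.match(r"\s*(hi|hello|dear)\b", message, re.I))),
    ]
    # Content-specific features: rates of topic keywords.
    content = [counts[w] / n_words for w in CONTENT_KEYWORDS]
    return style + structural + content


def train_centroids(samples):
    """Average the feature vectors of each author's (author, message) samples."""
    grouped = {}
    for author, message in samples:
        grouped.setdefault(author, []).append(extract_features(message))
    return {
        author: [sum(col) / len(col) for col in zip(*vectors)]
        for author, vectors in grouped.items()
    }


def predict(centroids, message: str) -> str:
    """Return the author whose centroid is closest in Euclidean distance."""
    x = extract_features(message)
    return min(centroids, key=lambda author: math.dist(centroids[author], x))
```

The features are left unscaled to keep the sketch short, so counts such as the number of lines dominate the distance; a real experiment would normalize them and would likely use a stronger learner.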

References (39)
Brian D Loader, Douglas Thomas, Cybercrime: law enforcement, security and surveillance in the information age. London: Routledge (2000)
Kamal Nigam, Andrew McCallum, A comparison of event models for naive Bayes text classification. National Conference on Artificial Intelligence, pp. 41-48 (1998)
Thorsten Joachims, Text categorization with support vector machines. Universität Dortmund (1999), 10.17877/DE290R-5097
Nello Cristianini, J Shawe-Taylor, An introduction to Support Vector Machines. Cambridge University Press (2000)
Joachim Diederich, Jörg Kindermann, Edda Leopold, Gerhard Paass, Authorship Attribution with Support Vector Machines. Applied Intelligence, vol. 19, pp. 109-123 (2003), 10.1023/A:1023824908771
O. de Vel, A. Anderson, M. Corney, G. Mohay, Mining e-mail content for author identification forensics. International Conference on Management of Data, vol. 30, pp. 55-64 (2001), 10.1145/604264.604272
Hsinchun Chen, Ganesan Shankaranarayanan, Linlin She, Anand Iyer, A machine learning approach to inductive query by examples: an experiment using relevance feedback, ID3, genetic algorithms, and simulated annealing. Journal of the Association for Information Science and Technology, vol. 49, pp. 693-705 (1998), 10.1002/(SICI)1097-4571(199806)49:8<693::AID-ASI4>3.0.CO;2-O