Using KNN and SVM Based One-Class Classifier for Detecting Online Radicalization on Twitter

Authors: Swati Agarwal, Ashish Sureka

DOI: 10.1007/978-3-319-14977-6_47

Abstract: Twitter is the largest and most popular micro-blogging website on the Internet. Due to its low publication barrier, anonymity and wide penetration, Twitter has become an easy target or platform for extremists to disseminate their ideologies and opinions by posting hate and extremism promoting tweets. Millions of tweets are posted on Twitter every day, and it is practically impossible for Twitter moderators or an intelligence and security analyst to manually identify such tweets, users and communities. However, automatic classification of tweets into pre-defined categories is a non-trivial problem due to the short text of a tweet (the maximum length of a tweet can be 140 characters) and its noisy content (incorrect grammar, spelling mistakes, presence of standard and non-standard abbreviations and slang). We frame the problem of hate and extremism promoting tweet detection as a one-class or unary-class categorization problem by learning a statistical model from a training set containing only the objects of one class. We propose several linguistic features, such as the presence of war, religious, negative emotion and offensive terms, to discriminate hate and extremism promoting tweets from other tweets. We employ a single-class SVM and a KNN algorithm for the one-class classification task. We conduct a case study on Jihad, perform a characterization study of the tweets and measure the precision and recall of the machine-learning based classifier. Experimental results on a large, real-world dataset demonstrate that the proposed approach is effective, with an F-score of 0.60 and 0.83 for the KNN and SVM classifier respectively.
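The one-class framing in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The lexicons, the count-based features, the fixed distance threshold, and all names (`LEXICONS`, `featurize`, `OneClassKNN`, `f_score`) are assumptions made for illustration only. The sketch trains on target-class tweets alone, accepts a new tweet when its nearest training neighbour is close enough, and computes the F-score as the harmonic mean of precision and recall.

```python
import math

# Hypothetical toy lexicons (assumption): the term lists used in the paper
# are not given in this record.
LEXICONS = {
    "war": {"war", "attack", "fight", "weapon"},
    "religious": {"jihad", "holy", "faith"},
    "negative_emotion": {"hate", "anger", "revenge"},
    "offensive": {"scum", "filth"},
}


def featurize(tweet):
    """Map a tweet to one term count per lexicon category."""
    tokens = [t.strip(".,!?#@\"'").lower() for t in tweet.split()]
    return [sum(1 for t in tokens if t in terms) for terms in LEXICONS.values()]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class OneClassKNN:
    """Nearest-neighbour one-class rule with a fixed threshold (assumption)."""

    def __init__(self, threshold=1.0):
        self.threshold = threshold
        self.points = []

    def fit(self, tweets):
        # Training uses objects of the target class only.
        self.points = [featurize(t) for t in tweets]
        return self

    def predict(self, tweet):
        # Accept the tweet if its nearest training point is within threshold.
        x = featurize(tweet)
        return min(euclidean(x, p) for p in self.points) <= self.threshold


def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up target-class training examples (assumption).
TRAIN = [
    "Join the holy war and fight",
    "Hate and revenge fuel the jihad",
    "Attack the scum with every weapon",
]

model = OneClassKNN(threshold=1.0).fit(TRAIN)
print(model.predict("Fight the holy war"))
print(model.predict("Lovely weather for a picnic today"))
```

A one-class SVM would take the place of `OneClassKNN` in the same pipeline, learning a boundary around the target-class feature vectors instead of thresholding the nearest-neighbour distance.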
