Authors: Swati Agarwal, Ashish Sureka
DOI: 10.1007/978-3-319-14977-6_47
Keywords:
Abstract: Twitter is the largest and most popular micro-blogging website on the Internet. Due to its low publication barrier, anonymity, and wide penetration, Twitter has become an easy target or platform for extremists to disseminate their ideologies and opinions by posting hate and extremism promoting tweets. Millions of tweets are posted on Twitter every day, and it is practically impossible for Twitter moderators or an intelligence and security analyst to manually identify such tweets, users, and communities. However, automatic classification of tweets into pre-defined categories is a non-trivial problem due to the short text of a tweet (the maximum length of a tweet can be 140 characters) and its noisy content (incorrect grammar, spelling mistakes, and the presence of standard and non-standard abbreviations and slang). We frame the problem of detecting hate and extremism promoting tweets as a one-class or unary-class categorization problem, learning a statistical model from a training set containing only objects of one class. We propose several linguistic features, such as the presence of war, religious, negative emotion, and offensive terms, to discriminate hate and extremism promoting tweets from other tweets. We employ a single-class SVM and a KNN algorithm for the one-class classification task. We conduct a case study on Jihad, perform a characterization study of the tweets, and measure the precision and recall of the machine-learning based classifier. Experimental results on a large, real-world dataset demonstrate that the proposed approach is effective, with F-scores of 0.60 and 0.83 for the KNN and SVM classifiers, respectively.
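To make the one-class framing in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. Everything in it is an assumption made for illustration: the term lists in `LEXICONS` are hypothetical placeholders (the abstract names the four feature categories but not their vocabularies), the features are simple per-category token fractions, and `OneClassKNN` is a simplified nearest-neighbour variant whose acceptance threshold is taken from leave-one-out distances within the training set. The `f_score` helper computes the standard F1 measure from precision and recall.

```python
import math

# Hypothetical term lists, one per feature category named in the abstract.
# The vocabularies actually used in the paper are not given there.
LEXICONS = {
    "war": {"war", "attack", "fight", "weapon"},
    "religious": {"jihad", "holy", "faith"},
    "negative_emotion": {"hate", "angry", "fear"},
    "offensive": {"scum", "filth"},
}


def featurize(tweet):
    """Return, per category, the fraction of tokens found in that lexicon."""
    tokens = [t.strip(".,!?#@\"'").lower() for t in tweet.split()]
    tokens = [t for t in tokens if t]
    n = max(len(tokens), 1)  # avoid division by zero on empty input
    return [sum(t in terms for t in tokens) / n for terms in LEXICONS.values()]


class OneClassKNN:
    """Simplified one-class k-NN: trained on target-class vectors only.

    A point is accepted if its mean distance to its k nearest training
    points does not exceed the largest such leave-one-out distance
    observed inside the training set (an assumed thresholding rule).
    """

    def __init__(self, k=1):
        self.k = k

    def _score(self, x, pool):
        nearest = sorted(math.dist(x, p) for p in pool)[: self.k]
        return sum(nearest) / len(nearest)

    def fit(self, vectors):
        # Requires at least two training vectors for leave-one-out scoring.
        self.train = [list(v) for v in vectors]
        self.threshold = max(
            self._score(v, self.train[:i] + self.train[i + 1:])
            for i, v in enumerate(self.train)
        )
        return self

    def predict(self, x):
        """True if x is judged to belong to the target class."""
        return self._score(x, self.train) <= self.threshold


def f_score(predicted, actual):
    """F1 score for boolean predictions against boolean ground truth."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In use, one would call `featurize` on each labelled target-class tweet, pass the resulting vectors to `OneClassKNN().fit(...)`, and then call `predict` on the feature vector of an unseen tweet. With so few features and toy lexicons the decision boundary is crude; a library implementation such as scikit-learn's `OneClassSVM` would be the more practical choice for the SVM side of the comparison.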