Predicting Emotion Labels for Chinese Microblog Texts

Authors: Zheng Yuan, Matthew Purver

DOI: 10.1007/978-3-319-18458-6_7

Abstract: We describe an experiment into detecting emotions in texts on the Chinese microblog service Sina Weibo (www.weibo.com) using distant supervision via various author-supplied emotion labels (emoticons and smilies). Existing word segmentation tools proved unreliable; better accuracy was achieved using character-based features. Higher-order n-grams proved to be useful features. Accuracy varied according to label and emotion: while smilies are used more often, emoticons are more reliable. Happiness is the most accurately predicted emotion, with accuracies around 90% on both distant and gold-standard labels. This approach works well and achieves high accuracies for happiness and anger, while it is less effective for sadness, surprise, disgust and fear, which are also difficult for human annotators to detect.
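The two ideas named in the abstract, distant labelling from author-supplied markers and character-based n-gram features, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the marker-to-emotion map `EMOTICON_LABELS` and the helper names `distant_label` and `char_ngrams` are hypothetical, and the markers used in the paper are not listed in this record.

```python
from collections import Counter

# Hypothetical marker-to-emotion map, for illustration only.
EMOTICON_LABELS = {":)": "happiness", ":(": "sadness"}


def distant_label(text):
    """Return (emotion, text without markers), or None if the text
    does not carry markers for exactly one emotion."""
    found = {emo for marker, emo in EMOTICON_LABELS.items() if marker in text}
    if len(found) != 1:
        return None  # unlabelled or ambiguous: not usable as training data
    cleaned = text
    for marker in EMOTICON_LABELS:
        # Remove the marker so a classifier cannot simply memorise it.
        cleaned = cleaned.replace(marker, "")
    return found.pop(), cleaned.strip()


def char_ngrams(text, max_n=3):
    """Count character n-grams for n = 1..max_n, ignoring whitespace.
    No word segmentation is involved."""
    chars = [c for c in text if not c.isspace()]
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(chars) - n + 1):
            counts["".join(chars[i:i + n])] += 1
    return counts


if __name__ == "__main__":
    labelled = distant_label("今天很开心 :)")
    if labelled is not None:
        emotion, body = labelled
        print(emotion, dict(char_ngrams(body, max_n=2)))
```

The resulting counts could be passed to any standard text classifier; the choice of classifier and of `max_n` is left open here because the record does not state them.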
