Analysis of Sentiment Labeling of Slovene User-Generated Content

Authors: Darja Fišer, Tomaž Erjavec

Abstract: The paper takes a close look at the results of sentiment annotation of the Janes corpus of Slovene user-generated content on 557 texts sampled from 5 text genres. Disagreements among the three human annotators are compared at the genre level as well as the text level. Next, we compare the automatically and manually assigned labels according to text genre. The effect of text genre on correct sentiment assignment is further investigated by examining texts with no inter-annotator agreement. We then look into the disagreements for texts with full human inter-annotator agreement but a different automatic classification. Finally, we examine the texts that both the humans and the model struggled with most.
