Robust Sentiment Detection on Twitter from Biased and Noisy Data

Authors: Junlan Feng, Luciano Barbosa

DOI:

Keywords:

Abstract: In this paper, we propose an approach to automatically detect sentiments on Twitter messages (tweets) that explores some characteristics of how tweets are written and meta-information of the words that compose these messages. Moreover, we leverage sources of noisy labels as our training data. These noisy labels were provided by a few sentiment detection websites over Twitter data. In our experiments, we show that since our features are able to capture a more abstract representation of tweets, our solution is more effective than previous ones and also more robust regarding biased and noisy data, which is the kind of data provided by these sources.

References (15)
Janyce Wiebe, Ellen Riloff. Creating Subjective and Objective Sentence Classifiers from Unannotated Texts. Computational Linguistics and Intelligent Text Processing, vol. 3406, pp. 486-497 (2005). DOI: 10.1007/978-3-540-30586-6_53
Mark A. Hall, Ian H. Witten, Eibe Frank. Data Mining: Practical Machine Learning Tools and Techniques (1999).
Theresa Wilson, Janyce Wiebe, Paul Hoffmann. Recognizing contextual polarity in phrase-level sentiment analysis. Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing (HLT '05), pp. 347-354 (2005). DOI: 10.3115/1220575.1220619
Jacob Cohen. A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, vol. 20, pp. 37-46 (1960). DOI: 10.1177/001316446002000104
Ellen Riloff, Janyce Wiebe. Learning extraction patterns for subjective expressions. Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, pp. 105-112 (2003). DOI: 10.3115/1119355.1119369
Ellen Riloff, Janyce Wiebe, Theresa Wilson. Learning subjective nouns using extraction pattern bootstrapping. North American Chapter of the Association for Computational Linguistics, pp. 25-32 (2003). DOI: 10.3115/1119176.1119180
Bo Pang, Lillian Lee. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. Meeting of the Association for Computational Linguistics, pp. 271-278 (2004). DOI: 10.3115/1218955.1218990
Natalie Glance, Matthew Hurst, Kamal Nigam, Matthew Siegler, Robert Stockton, Takashi Tomokiyo. Deriving marketing intelligence from online discussion. Knowledge Discovery and Data Mining, pp. 419-428 (2005). DOI: 10.1145/1081870.1081919
Ellen Riloff, Siddharth Patwardhan, Janyce Wiebe. Feature Subsumption for Opinion Analysis. Empirical Methods in Natural Language Processing, pp. 440-448 (2006). DOI: 10.3115/1610075.1610137
Victor S. Sheng, Foster Provost, Panagiotis G. Ipeirotis. Get another label? Improving data quality and data mining using multiple, noisy labelers. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '08), pp. 614-622 (2008). DOI: 10.1145/1401890.1401965