Customizing Sentiment Classifiers to New Domains: a Case Study

Authors: Michael Gamon, Anthony Aue

Keywords: Computer science, Artificial intelligence, Domain (software engineering), Labeled data, Base (topology), Machine learning

Abstract: Sentiment classification is a very domain-specific problem; classifiers trained in one domain do not perform well in others. Unfortunately, many domains lack large amounts of labeled data for fully supervised learning approaches. At the same time, sentiment classifiers need to be customizable to new domains in order to be useful in practice. We attempt to address these difficulties and constraints in this paper, where we survey four different approaches to customizing a sentiment classification system to a new target domain in the absence of large amounts of labeled data. We base our experiments on data from four different domains. After establishing that naive cross-domain classification results in poor classification accuracy, we compare the results obtained by using each of the four approaches and discuss their advantages, disadvantages, and performance.
