Unsupervised and knowledge-poor approaches to sentiment analysis

Author: Taras Zagibalov

Abstract: Sentiment analysis focuses upon automatic classification of a document's sentiment (and, more generally, extraction of opinion from text). Ways of expressing sentiment have been shown to be dependent on what a document is about (domain-dependency). This complicates supervised methods for sentiment analysis, which rely on extensive use of training data or linguistic resources that are usually either domain-specific or generic. Both kinds of resources prevent classifiers from performing well across a range of domains, as this requires appropriate in-domain (domain-specific) data. This thesis presents a novel unsupervised, knowledge-poor approach to sentiment analysis aimed at creating a domain-independent and multilingual sentiment analysis system. The approach extracts domain-specific resources from the documents to be processed, and uses them for sentiment analysis. It does not require any training corpora, large sets of rules or generic sentiment lexicons, which makes it domain- and language-independent but at the same time able to utilise domain- and language-specific information. The thesis describes and tests the approach, which is applied to different data, including customer reviews of various types of products, reviews of films and books, and news items; and to four languages: Chinese, English, Russian and Japanese. The approach is applied not only to binary sentiment classification, but also to three-way sentiment classification (positive, negative and neutral), subjectivity classification of documents and sentences, and the extraction of opinion holders and opinion targets. Experimental results suggest that the approach is often a viable alternative to supervised systems, especially when applied to large document collections.
