Exploring English lexicon knowledge for Chinese sentiment analysis

Authors: Harith Alani, Yulan He, Deyu Zhou

Abstract: This paper presents a weakly supervised method for Chinese sentiment analysis that incorporates lexical prior knowledge obtained from English sentiment lexicons through machine translation. A mechanism is introduced to incorporate the prior information about polarity-bearing words from existing lexicons into latent Dirichlet allocation (LDA), where sentiment labels are treated as topics. Experiments on Chinese product reviews of mobile phones, digital cameras, MP3 players, and monitors demonstrate the feasibility and effectiveness of the proposed approach and show that the weakly supervised LDA model performs as well as classifiers such as Naive Bayes and Support Vector Machines, with an average accuracy of 83% achieved over a total of 5484 review documents. Moreover, the model is able to extract highly domain-salient polarity words from text.
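One way to picture the idea of injecting lexicon knowledge into LDA with sentiment labels as topics is as an asymmetric word prior per label. The sketch below is only an illustration under stated assumptions, not the authors' formulation: the function name `build_word_prior`, the parameters `base` and `damp`, and the toy lexicon are all hypothetical, and the lexicon is assumed to have been machine-translated from English already.

```python
from typing import Dict, List

# Sentiment labels play the role of LDA topics (assumption: two labels only).
LABELS = ["positive", "negative"]


def build_word_prior(
    vocab: List[str],
    lexicon: Dict[str, str],
    base: float = 0.01,
    damp: float = 0.1,
) -> Dict[str, Dict[str, float]]:
    """Return prior[label][word], a Dirichlet pseudo-count per word and label.

    Words absent from the lexicon get the same value under every label.
    Lexicon words keep `base` under their own label and are scaled down
    by `damp` under the opposite label (hypothetical weighting scheme).
    """
    prior: Dict[str, Dict[str, float]] = {}
    for label in LABELS:
        row: Dict[str, float] = {}
        for word in vocab:
            polarity = lexicon.get(word)
            if polarity is None or polarity == label:
                row[word] = base
            else:
                row[word] = base * damp
        prior[label] = row
    return prior


# Toy data: a translated lexicon with one positive and one negative word.
vocab = ["好", "差", "屏幕"]  # "good", "bad", "screen"
lexicon = {"好": "positive", "差": "negative"}
prior = build_word_prior(vocab, lexicon)
```

A matrix like `prior` could then be supplied as the topic-word Dirichlet parameter of an LDA sampler, so that each label is nudged toward words of its own polarity while domain-specific words remain free to be assigned to either label.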
