Document-Word Co-regularization for Semi-supervised Sentiment Analysis

Authors: Vikas Sindhwani, Prem Melville

DOI: 10.1109/ICDM.2008.113

Abstract: The goal of sentiment prediction is to automatically identify whether a given piece of text expresses positive or negative opinion towards a topic of interest. One can pose sentiment prediction as a standard text categorization problem, but gathering labeled data turns out to be a bottleneck. Fortunately, background knowledge is often available in the form of prior information about the sentiment polarity of words in a lexicon. Moreover, in many applications abundant unlabeled data is also available. In this paper, we propose a novel semi-supervised sentiment prediction algorithm that utilizes lexical prior knowledge in conjunction with unlabeled examples. Our method is based on joint sentiment analysis of documents and words based on a bipartite graph representation of the data. We present an empirical study on a diverse collection of sentiment prediction problems which confirms that our semi-supervised lexical models significantly outperform purely supervised and competing semi-supervised techniques.
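To make the bipartite document-word idea in the abstract concrete, here is a minimal sketch. It is not the co-regularization algorithm of the paper: it swaps in a simple score-propagation heuristic, in which labeled documents stay fixed, lexicon words are pulled towards their prior polarity, and all other scores are averages over graph neighbors. The function name `propagate_sentiment`, the mixing weight `alpha`, and the toy data are all assumptions made for illustration.

```python
def propagate_sentiment(docs, doc_labels, lexicon, n_iter=20, alpha=0.5):
    """Propagate sentiment scores over a document-word bipartite graph.

    docs:       list of token lists, one per document
    doc_labels: dict {doc_index: +1 or -1} for the few labeled documents
    lexicon:    dict {word: +1 or -1} of prior word polarities
    alpha:      hypothetical weight tying lexicon words to their prior
    """
    # Bipartite edges: a document is linked to each distinct word it contains.
    doc_words = [sorted(set(d)) for d in docs]
    word_docs = {}
    for i, words in enumerate(doc_words):
        for w in words:
            word_docs.setdefault(w, []).append(i)

    # Unlabeled documents and non-lexicon words start at a neutral score of 0.
    doc_scores = [float(doc_labels.get(i, 0.0)) for i in range(len(docs))]
    word_scores = {w: float(lexicon.get(w, 0.0)) for w in word_docs}

    for _ in range(n_iter):
        # Word step: average the scores of the documents containing the word,
        # mixed with the lexical prior when one is available.
        for w, nbrs in word_docs.items():
            avg = sum(doc_scores[i] for i in nbrs) / len(nbrs)
            if w in lexicon:
                word_scores[w] = alpha * lexicon[w] + (1 - alpha) * avg
            else:
                word_scores[w] = avg
        # Document step: labeled documents stay clamped; unlabeled ones take
        # the average score of their words.
        for i, words in enumerate(doc_words):
            if i in doc_labels or not words:
                continue
            doc_scores[i] = sum(word_scores[w] for w in words) / len(words)
    return doc_scores, word_scores


# Toy data (made up): two labeled and two unlabeled documents.
docs = [
    ["great", "fun"],
    ["awful", "dull"],
    ["great", "fun", "story"],
    ["awful", "dull", "story"],
]
doc_labels = {0: 1, 1: -1}
lexicon = {"great": 1, "awful": -1}

doc_scores, word_scores = propagate_sentiment(docs, doc_labels, lexicon)
```

In this toy run the unlabeled documents inherit the sign of the lexicon and labeled neighbors they share words with, and words outside the lexicon such as "fun" and "dull" pick up a polarity from the documents they occur in. That is the intuition behind analyzing documents and words jointly, although the paper's actual objective and solver are not reproduced here.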

References (28)
O. Chapelle, A. Zien, Semi-Supervised Classification by Low Density Separation. International Conference on Artificial Intelligence and Statistics, pp. 57-64 (2005)
Philip S. Yu, Wee Sun Lee, Bing Liu, Xiaoli Li, Text classification by labeling words. National Conference on Artificial Intelligence, pp. 425-430 (2004)
Tomaso Poggio, Ryan Michael Rifkin, Everything old is new again: a fresh look at historical approaches in machine learning (2002)
Fan R. K. Chung, Spectral Graph Theory (1996)
Robert E. Schapire, Mazin G. Rahim, Narendra Gupta, Marie Rochery, Incorporating Prior Knowledge into Boosting. International Conference on Machine Learning, pp. 538-545 (2002)
Alexander J. Smola, Risi Kondor, Kernels and Regularization on Graphs. Learning Theory and Kernel Machines, pp. 144-158 (2003), doi:10.1007/978-3-540-45167-9_12
Aynur Dayanik, David D. Lewis, David Madigan, Vladimir Menkov, Alexander Genkin, Constructing informative prior distributions from domain knowledge in text classification. Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '06), pp. 493-500 (2006), doi:10.1145/1148170.1148255
Ganesh Ramakrishnan, Apurva Jadhav, Ashutosh Joshi, Soumen Chakrabarti, Pushpak Bhattacharyya, Question Answering via Bayesian Inference on Lexical Relations. Meeting of the Association for Computational Linguistics, pp. 1-10 (2003), doi:10.3115/1119312.1119313
Xiaoyun Wu, Rohini Srihari, Incorporating prior knowledge with weighted margin support vector machines. Proceedings of the 2004 ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '04), pp. 326-333 (2004), doi:10.1145/1014052.1014089
Theresa Wilson, Janyce Wiebe, Paul Hoffmann, Recognizing contextual polarity in phrase-level sentiment analysis. Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing (HLT '05), pp. 347-354 (2005), doi:10.3115/1220575.1220619