Baseline evaluation: an empirical study of the performance of machine learning algorithms in short snippet sentiment analysis

Author: Farag Saad

DOI: 10.1145/2637748.2638420

Abstract: At present, sentiment analysis is a relatively recent research field, which has led to many inconsistent findings in the literature. The debate is two-fold: which baseline classifier performs best, and which feature weighting method, e.g., term presence (TP), term frequency (TF), TF-IDF, etc., is most useful for improving a classifier's performance. Naive Bayes, with its variations, and the Support Vector Machine are commonly used for this task. However, their reported performance varies among researchers, and the findings diverge. In order to shed some light on this controversy, we have conducted a series of broad comparative experiments (including twelve various domains) to evaluate the performance of machine learning classifiers (Naive Bayes variations, Support Vector Machine, and J48, an implementation of a decision tree-based classifier). The experimental results indicate that Binarized Multinomial Naive Bayes (BMNB) exhibits the best performance in short snippet sentiment analysis. Furthermore, classification performance has been significantly improved by using feature selection methods, namely information gain (IG).
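
To make the two techniques named in the abstract concrete, here is a minimal sketch in plain Python. It is not the paper's implementation, which this record does not describe. It assumes whitespace-tokenized snippets, Laplace smoothing with a hypothetical parameter `alpha`, and a binary presence/absence definition of information gain. The function names and the four toy snippets are invented for illustration only.

```python
import math
from collections import Counter, defaultdict


def binarize(tokens):
    """Term presence (TP): count each distinct term at most once per snippet."""
    return set(tokens)


def train_bmnb(docs, labels, alpha=1.0):
    """Fit multinomial Naive Bayes on binarized counts (assumed Laplace smoothing)."""
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)  # class -> term -> snippets containing the term
    vocab = set()
    for tokens, label in zip(docs, labels):
        present = binarize(tokens)
        term_counts[label].update(present)
        vocab |= present
    model = {"priors": {}, "loglik": {}}
    for c, n_c in class_docs.items():
        denom = sum(term_counts[c].values()) + alpha * len(vocab)
        model["priors"][c] = math.log(n_c / len(docs))
        model["loglik"][c] = {
            t: math.log((term_counts[c][t] + alpha) / denom) for t in vocab
        }
    return model


def predict(model, tokens):
    """Return the class with the highest log-score; unseen terms are ignored."""
    scores = {}
    for c, prior in model["priors"].items():
        loglik = model["loglik"][c]
        scores[c] = prior + sum(loglik[t] for t in binarize(tokens) if t in loglik)
    return max(scores, key=scores.get)


def entropy(counts):
    """Shannon entropy (in bits) of a collection of class counts."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts if n)


def information_gain(docs, labels, term):
    """IG of a term: class entropy minus entropy after splitting on term presence."""
    with_t = Counter(l for d, l in zip(docs, labels) if term in d)
    without_t = Counter(l for d, l in zip(docs, labels) if term not in d)
    h_cond = sum(
        sum(part.values()) / len(docs) * entropy(part.values())
        for part in (with_t, without_t)
    )
    return entropy(Counter(labels).values()) - h_cond


# Hypothetical toy snippets, not data from the study.
train_docs = [
    ["great", "great", "phone"],
    ["love", "it"],
    ["bad", "battery", "bad"],
    ["awful", "screen"],
]
train_labels = ["pos", "pos", "neg", "neg"]
model = train_bmnb(train_docs, train_labels)
print(predict(model, ["great", "screen", "great", "love"]))
print(information_gain(train_docs, train_labels, "great"))
```

Binarizing means that a repeated word such as "great" contributes once per snippet. This is the difference between term presence and term frequency weighting in this sketch. Ranking the vocabulary by `information_gain` and keeping only the top-scoring terms before training would be one simple way to apply IG-based feature selection.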
