English Sentiment Classification using Only the Sentiment Lexicons with a JOHNSON Coefficient in a Parallel Network Environment

Authors: Vo Ngoc Phu, Vo Thi Ngoc Tran

DOI: 10.3844/AJEASSP.2018.38.65

Keywords: Term (time); Natural language processing; Test data; Similarity (psychology); Artificial intelligence; Field (computer science); Set (abstract data type); Big data; Word (computer architecture); Computer science; Phrase

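Abstract: Sentiment classification is significant in everyday life, for example in political activities, commodity production and commercial activities. In this survey, we propose a new model for Big Data sentiment classification. We use the sentiment lexicons of our basis English sentiment Dictionary (bESD) to classify the 5,000,000 English documents of our testing data set, comprising 2,500,000 positive and 2,500,000 negative documents. We do not use any training data set. We do not use any one-dimensional vector in either the sequential environment or the distributed network system, and we also do not use any multi-dimensional vector in either the sequential system or the parallel environment. We use a JOHNSON Coefficient (JC), computed through the Google search engine with the AND operator and the OR operator, to identify the sentiment values of the lexicons in the bESD. One term (a word or a phrase in English) is clustered into either the positive polarity or the negative polarity if it is very close to that polarity according to the similarity measures of the JC; that is, the term is very similar to either the positive or the negative. We tested the model and achieved 87.56% accuracy on the testing data set. The execution time in the parallel network environment is faster than in the sequential environment. Our model can classify millions of English documents based on the sentiment lexicons of the bESD, without depending on any special domain or any training stage. This survey used similarity coefficients from the data mining field. The results of this work can be widely used in applications and research on English sentiment classification.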
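As a rough illustration of how a term might be assigned a polarity from search hit counts, the sketch below uses the standard binary form of the Johnson similarity coefficient, a/(a+b) + a/(a+c), where a is the number of pages containing both terms and b and c are the numbers containing only one of them. It is a minimal sketch under stated assumptions, not the authors' implementation. The seed words, the hit counts, and the function names are all hypothetical. The counts are passed in as plain dictionaries rather than fetched from a search engine, and the scoring rule (similarity to positive seeds minus similarity to negative seeds) is assumed, since the abstract does not spell it out.

```python
def johnson_coefficient(both, only_first, only_second):
    """Johnson similarity a/(a+b) + a/(a+c) for binary co-occurrence counts."""
    if both == 0:
        return 0.0  # no co-occurrence: treat similarity as zero
    return both / (both + only_first) + both / (both + only_second)


def counts_from_hits(hits_first, hits_second, hits_and):
    """Derive (a, b, c) from single-term hit counts and the AND-query hit count."""
    return hits_and, hits_first - hits_and, hits_second - hits_and


def polarity(term, hits, cooc, pos_seeds, neg_seeds):
    """Assumed scoring rule: similarity to positive seeds minus negative seeds."""
    def sim(seed):
        a, b, c = counts_from_hits(hits[term], hits[seed], cooc[(term, seed)])
        return johnson_coefficient(a, b, c)

    score = sum(sim(s) for s in pos_seeds) - sum(sim(s) for s in neg_seeds)
    if score > 0:
        return "positive", score
    if score < 0:
        return "negative", score
    return "neutral", score


# Hypothetical hit counts, invented purely for illustration.
hits = {"superb": 1000, "good": 5000, "bad": 4000}
cooc = {("superb", "good"): 400, ("superb", "bad"): 100}

label, score = polarity("superb", hits, cooc, ["good"], ["bad"])
print(label, round(score, 3))
```

With these made-up counts the term is closer to the positive seed than to the negative one, so it is labelled positive. A document could then be scored by summing the values of the lexicon terms it contains, which is the kind of computation that can be distributed across nodes.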
