Authors: Vo Ngoc Phu, Vo Thi Ngoc Tran
DOI: 10.3844/AJEASSP.2018.38.65
Keywords: Term (time), Natural language processing, Test data, Similarity (psychology), Artificial intelligence, Field (computer science), Set (abstract data type), Big data, Word (computer architecture), Computer science, Phrase
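Abstract: Sentiment classification is significant in everyday life, for example in political activities, commodity production and commercial activities. In this survey, we propose a new model for Big Data sentiment classification. We use the sentiment lexicons of our basis English Sentiment Dictionary (bESD) to classify the 5,000,000 English documents of our testing data set, comprising 2,500,000 positive and 2,500,000 negative documents. We do not use any training data set. We do not use any one-dimensional vector in either a sequential environment or a distributed network system, and we also do not use any multi-dimensional vector in either a sequential system or a parallel environment. We use a JOHNSON Coefficient (JC), computed through the Google search engine with the AND operator and the OR operator, to identify the sentiment values of the lexicons of the bESD. One term (a word or a phrase in English) is clustered into either the positive polarity or the negative polarity if it is very close to that polarity according to the similarity measures of the JC; that is, the term is very similar to either the positive or the negative. We tested the model and achieved 87.56% accuracy on the testing data set. The execution time in the parallel network environment is faster than in the sequential system. Our model can classify millions of English documents based on the sentiment lexicons, and it does not depend on any special domain or on a training stage. This survey uses similarity coefficients from the data mining field. The results of this work can be widely used in applications and research.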
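The lexicon-scoring idea in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It uses the Johnson similarity as commonly defined for binary data, a/(a+b) + a/(a+c), with a read as the hit count of an AND query and a+b, a+c as the hit counts of each term alone. The anchor words "good" and "bad", the valence rule (positive similarity minus negative similarity), the sign-of-sum document rule, the function names, and every count in `HITS` are hypothetical. The abstract also mentions the OR operator, but does not say how it enters the computation, so it is not used here.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical counts standing in for search-engine hit counts.
# A one-term key is the count for that term alone; a two-term key is the
# count for an AND query. All numbers are invented for illustration.
HITS: Dict[FrozenSet[str], int] = {
    frozenset({"good"}): 1000,
    frozenset({"bad"}): 800,
    frozenset({"superb"}): 200,
    frozenset({"superb", "good"}): 120,
    frozenset({"superb", "bad"}): 20,
}


def johnson(term: str, anchor: str, hits: Dict[FrozenSet[str], int]) -> float:
    """Johnson similarity a/(a+b) + a/(a+c) computed from hit counts."""
    both = hits.get(frozenset({term, anchor}), 0)   # a: term AND anchor
    n_term = hits.get(frozenset({term}), 0)         # a + b
    n_anchor = hits.get(frozenset({anchor}), 0)     # a + c
    if n_term == 0 or n_anchor == 0:
        return 0.0  # no evidence for one of the terms
    return both / n_term + both / n_anchor


def valence(term: str, hits: Dict[FrozenSet[str], int],
            pos: str = "good", neg: str = "bad") -> float:
    """Assumed valence rule: similarity to the positive anchor minus
    similarity to the negative anchor."""
    return johnson(term, pos, hits) - johnson(term, neg, hits)


def classify(tokens: Iterable[str], lexicon: Dict[str, float]) -> str:
    """Assumed document rule: sign of the summed term valences."""
    score = sum(lexicon.get(t, 0.0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    lexicon = {"superb": valence("superb", HITS)}
    print(classify(["a", "superb", "film"], lexicon))
```

Because each document is scored independently of the others, a scheme of this shape needs no training stage and can be split across workers, which is consistent with the parallel setting the abstract describes.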