Authors: Dengya Zhu, Kok Wai Wong
DOI: 10.1007/978-3-319-12637-1_60
Keywords: Benchmark (computing), Set (abstract data type), Feature (machine learning), Computer science, Boosting methods for object categorization, Naive Bayes classifier, Text categorization, Feature selection, AdaBoost, Machine learning, Artificial intelligence
Abstract: Naive Bayes (NB), kNN and AdaBoost are three commonly used text classifiers. Evaluating these classifiers involves a variety of factors, including the benchmark used, feature selection, the parameter settings of the algorithms, and the measurement criteria employed. Researchers have demonstrated that some algorithms outperform others on some corpora; however, labeling and corpus bias are two concerns in text categorization. This paper focuses on evaluating the three classifiers by using an automatically generated document set which is labelled by a group of experts to alleviate the subjectiveness of labelling, and at the same time examines how their performance is influenced by feature selection and the number of features selected.