Authors: Flavien Bouillot, Pascal Poncelet, Mathieu Roche
DOI: 10.1007/978-3-319-08326-1_35
Keywords: Naive Bayes classifier, Classifier (UML), Classification methods, Artificial intelligence, Weighting, tf–idf, Pattern recognition, Computer science, Small number
Abstract: In text classification, providing an efficient classifier even when the number of documents involved in the learning step is small remains an important issue. In this paper, we evaluate the performance of traditional classification methods to better understand their limitations in the learning phase when dealing with a small number of documents. We thus propose a new way of weighting the features that are used for classification. These features have been integrated into two well-known classifiers, Class-Feature-Centroid and Naive Bayes, and evaluations have been performed on real datasets. We have also investigated the influence of parameters such as the number of classes, documents, or words on the classification. Experiments have shown the efficiency of our proposal relative to state-of-the-art classification methods. Whether with very little data or with the small number of features that can be extracted from content-poor documents, we show that our approach performs well.