Authors: Uma Subramanian, Hang See Ong
Keywords:
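Abstract: This paper analyzes the effect of clustering the training and test data on the classification efficiency of the Naive Bayes classifier. The KDD Cup 99 benchmark dataset is used in this research. The data set is clustered into 5 clusters using the k-means algorithm, and 8800 samples are then taken from the clusters to form the training set. The results are compared with those of two Naive Bayes classifiers trained on randomly sampled data containing 8800 and 17600 instances, respectively. The main contribution of the paper is empirical evidence that a training set derived from clusters generated by k-means improves classification efficiency. The results show an accuracy of 94.4% for the classifier trained on the clustered data, versus 85.41% and 89.26% for the classifiers trained on the normal (randomly sampled) data.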
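
The sampling step described in the abstract (drawing a fixed number of training samples from k-means clusters) might look roughly like the sketch below. The abstract does not say how the 8800 samples are divided among the 5 clusters, so the equal allocation used here is an assumption, as are the function name `sample_from_clusters`, the seed, and the toy cluster labels. The cluster labels are assumed to come from a k-means run performed beforehand.

```python
import random
from collections import defaultdict


def sample_from_clusters(labels, total, seed=0):
    """Pick `total` row indices, spread as evenly as possible over clusters.

    labels[i] is the cluster id of row i (for example, from a k-means run).
    Equal allocation per cluster is an assumption of this sketch; the
    abstract does not state the allocation actually used.
    """
    rng = random.Random(seed)

    # Group row indices by their cluster id.
    members = defaultdict(list)
    for idx, cluster in enumerate(labels):
        members[cluster].append(idx)

    clusters = sorted(members)
    base, extra = divmod(total, len(clusters))

    chosen = []
    for pos, cluster in enumerate(clusters):
        # The first `extra` clusters take one more row when total
        # is not divisible by the number of clusters.
        quota = base + (1 if pos < extra else 0)
        if quota > len(members[cluster]):
            raise ValueError(f"cluster {cluster} has fewer than {quota} rows")
        # Sample without replacement inside the cluster.
        chosen.extend(rng.sample(members[cluster], quota))
    return chosen


# Toy stand-in for k-means output: 5 clusters of 3000 rows each.
labels = [i % 5 for i in range(15000)]
train_idx = sample_from_clusters(labels, 8800)
```

With 5 clusters and 8800 samples, this allocation draws 1760 rows per cluster. The randomly sampled baseline would instead draw indices from the whole data set without looking at cluster labels.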