Authors: Kevin S. Tickle, Mohammed M. Mazid, A. B. M. Shawkat Ali
Keywords:
Abstract: Rule-based classification is one of the most popular approaches in data mining, and a number of algorithms exist for it. C4.5 and Partial Decision Tree (PART) are very popular among them, and both have many empirical features such as continuous-value categorization and missing-value handling. However, in many cases these algorithms take more processing time and provide a lower rate of correctly classified instances. One of the main reasons is the high dimensionality of the databases: a large dataset might contain hundreds of attributes with a huge number of instances. We need to choose the most related attributes to obtain higher accuracy. It is also a difficult task to choose a proper algorithm to perform efficient, accurate classification. With our proposed method, we select the relevant attributes from a dataset by reducing the input space and simultaneously improve the performance of these two algorithms. The improved performance is measured in terms of accuracy and computational complexity. We measure the entropy of information theory to identify the central attribute of a dataset. We then apply correlation coefficient measures, namely Pearson's, Spearman's, and Kendall's, utilizing the central attribute of the same dataset. We conducted a comparative study using these three measures to choose the best method. We picked datasets from the well-known UCI (University of California, Irvine) repository and used box plots to compare the experimental results. Our proposed method showed improved performance in the individual experiments.
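
The attribute-selection idea in the abstract (an entropy-based central attribute, then correlation of every other attribute against it) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions that the abstract does not state: numeric columns are discretized with equal-width bins before computing entropy, the attribute with the highest entropy is taken as the central one, Kendall's coefficient is computed as tau-b, and the function names (`entropy`, `rank_by_central`, and so on) are hypothetical. The sketch only ranks attributes by absolute correlation with the central attribute; the cutoff and which end of the ranking to keep are left open because the abstract does not specify them.

```python
import math
from collections import Counter

def entropy(values, bins=4):
    """Shannon entropy (bits) of a numeric column after equal-width binning (assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant column: avoid division by zero
    labels = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def pearson(x, y):
    """Pearson product-moment correlation; returns 0.0 if either column is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def ranks(x):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall(x, y):
    """Kendall tau-b, computed over all pairs in O(n^2)."""
    conc = disc = only_x_tied = only_y_tied = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx * dy > 0:
                conc += 1
            elif dx * dy < 0:
                disc += 1
            elif dx == 0 and dy != 0:
                only_x_tied += 1
            elif dy == 0 and dx != 0:
                only_y_tied += 1
    denom = math.sqrt((conc + disc + only_x_tied) * (conc + disc + only_y_tied))
    return (conc - disc) / denom if denom else 0.0

def rank_by_central(columns, measure=pearson):
    """Pick the highest-entropy column as central (an assumption), then rank the
    remaining columns by absolute correlation with it, strongest first."""
    central = max(columns, key=lambda name: entropy(columns[name]))
    scores = [(name, abs(measure(col, columns[central])))
              for name, col in columns.items() if name != central]
    return central, sorted(scores, key=lambda item: item[1], reverse=True)

# Toy data, invented purely for illustration.
columns = {
    "x": [1, 2, 3, 4, 5, 6, 7, 8],
    "y": [1, 1, 1, 1, 2, 2, 2, 2],
    "z": [5, 5, 5, 5, 5, 5, 5, 9],
}
for measure in (pearson, spearman, kendall):
    central, ranking = rank_by_central(columns, measure)
    print(measure.__name__, central, ranking)
```

Passing `pearson`, `spearman`, or `kendall` as `measure` gives a simple way to compare how the three coefficients order the same attributes, which mirrors the comparative study the abstract mentions.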