Author: Dewan Md. Farid
DOI: 10.1109/ICCITECHN.2011.6164874
Keywords: Discretization of continuous features, Machine learning, Interval (mathematics), Computer science, Naive Bayes classifier, Decision tree learning, Decision tree, Heuristic, Benchmark (computing), Artificial intelligence, Discretization, Data mining
Abstract: Dealing with continuous-valued attributes is an important data mining problem that affects the accuracy, complexity, and understandability of the algorithms. This paper presents a new approach for dealing with continuous-valued attributes to improve the quality of discretization as a preprocessing step for decision tree and naive Bayesian classifiers. The proposed approach focuses on supervised discretization; however, unsupervised discretization can also be applied in the same way. It finds the possible cut points of continuous attribute values that separate the class distributions, and then considers the best cut point as the interval border using the information gain heuristic. The proposed approach has been tested by comparing it with other discretization methods on a number of benchmark problems from the UCI machine learning repository. The experimental results indicate that the proposed approach improves the quality of discretization.
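
The cut-point selection described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract gives no algorithmic detail, so the function names (`entropy`, `candidate_cut_points`, `information_gain`, `best_cut_point`), the use of midpoints between adjacent distinct values as candidates, and the single binary split are all assumptions made for illustration. The sketch only shows the general idea of keeping cut points that separate class distributions and choosing the one with the highest information gain.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def candidate_cut_points(values, labels):
    """Midpoints between adjacent distinct values that separate classes.

    Assumption for this sketch: a midpoint is skipped only when both
    neighbouring values are pure and belong to the same class.
    """
    classes_at = {}
    for v, y in zip(values, labels):
        classes_at.setdefault(v, set()).add(y)
    distinct = sorted(classes_at)
    cuts = []
    for lo, hi in zip(distinct, distinct[1:]):
        if len(classes_at[lo] | classes_at[hi]) > 1:
            cuts.append((lo + hi) / 2)
    return cuts


def information_gain(values, labels, cut):
    """Entropy reduction from splitting the attribute at `cut`."""
    left = [y for v, y in zip(values, labels) if v <= cut]
    right = [y for v, y in zip(values, labels) if v > cut]
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted


def best_cut_point(values, labels):
    """Candidate cut with the highest information gain, or None if there is none."""
    cuts = candidate_cut_points(values, labels)
    if not cuts:
        return None
    return max(cuts, key=lambda c: information_gain(values, labels, c))


# Toy data (made up for illustration, not taken from the paper).
values = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
labels = ["no", "no", "no", "yes", "yes", "yes"]
print(best_cut_point(values, labels))  # 5.5
```

Applying `best_cut_point` recursively to each resulting interval, with a stopping rule, would yield a multi-interval discretization; the choice of stopping rule is left open here because the abstract does not specify one.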