Authors: João Gama, Luis Torgo, Carlos Soares
Keywords: Sorting, Machine learning, Artificial intelligence, Discretization, Benchmark (computing), Bayesian probability, Feature selection, Computer science, Naive Bayes classifier, Discretization of continuous features, Decision tree
Abstract: Discretization of continuous attributes is an important task for certain types of machine learning algorithms. Bayesian approaches, for instance, require assumptions about data distributions. Decision trees, on the other hand, require sorting operations to deal with continuous attributes, which largely increase learning times. This paper presents a new method of discretization, whose main characteristic is that it takes into account interdependencies between attributes. Detecting interdependencies can be seen as discovering redundant attributes. This means that our method performs attribute selection as a side effect of the discretization. Empirical evaluation on five benchmark datasets from the UCI repository, using C4.5 and naive Bayes, shows a consistent reduction of the features without loss of generalization accuracy.