Authors: Tim Niblett, Peter Clark
DOI:
Keywords: Noise (video), Mathematics, Data set, Almost surely, Imperfect, Class (set theory), Domain (software engineering), Measuring equipment, Data mining, Counterexample
Abstract: This paper examines the induction of classification rules from examples using real-world data. Real-world data is almost always characterized by two features, which are important for the design of an induction algorithm. Firstly, there is often noise present, for example, due to imperfect measuring equipment used to collect the data. Secondly, the description language is often incomplete, such that examples with identical descriptions in the language will not always be members of the same class. Many induction systems make the ‘noiseless domain’ assumption that the examples do not contain errors and that the description language is complete, and consequently constrain their search for rules to those for which no counterexamples exist in the training data. However, in real-world domains, correlations between attributes and classes in a data set are rarely without exceptions. To locate such correlations and induce rules describing them, it is also necessary to consider rules which may not classify all the training examples correctly. This paper firstly discusses some of the problems presented by such data and proposes a top-down induction algorithm for such domains. Secondly, it presents an experimental comparison of this algorithm with other algorithms on three sets of medical data.