An Efficient Heuristic for Discovering Multiple Ill-Defined Attributes in Datasets

Author: Sylvain Halle

DOI: 10.1109/ICMLA.2006.14

Abstract: The accuracy of the rules produced by a concept learning system can be hindered by the presence of errors in the data, such as "ill-defined" attributes that are too general or too specific for the concept to learn. In this paper, we devise a method that uses the Boolean differences computed by a program called Newton to identify multiple ill-defined attributes in a dataset in a single pass. The method is based on a compound heuristic that assigns a real-valued rank to each possible hypothesis based on its key characteristics. We show by extensive empirical testing on randomly generated classifiers that the hypothesis with the highest rank is the correct one with an observed probability quickly converging to 100%. Moreover, the monotonicity of the rank function enables us to use it as a rough estimator of its own likelihood.
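For readers unfamiliar with the term, the Boolean difference of a function f with respect to a variable x_i is the standard quantity f(x_i = 0) XOR f(x_i = 1): it is true exactly where flipping x_i changes the output. The sketch below illustrates only that textbook definition on a toy classifier. It does not reproduce Newton or the paper's ranking heuristic, neither of which is described in this record; the names `boolean_difference`, `depends_on`, and `concept` are hypothetical and chosen for illustration.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

# A Boolean classifier over n attributes (hypothetical type alias).
BoolFn = Callable[[Sequence[bool]], bool]


def boolean_difference(f: BoolFn, i: int, n: int) -> List[Tuple[Tuple[bool, ...], bool]]:
    """Truth table of df/dx_i = f(x_i=False) XOR f(x_i=True).

    Each entry pairs an assignment of the other n-1 attributes with the
    value of the difference at that assignment.
    """
    table = []
    for rest in product([False, True], repeat=n - 1):
        low = rest[:i] + (False,) + rest[i:]   # attribute i forced to False
        high = rest[:i] + (True,) + rest[i:]   # attribute i forced to True
        table.append((rest, f(low) != f(high)))
    return table


def depends_on(f: BoolFn, i: int, n: int) -> bool:
    """True if flipping attribute i can change the output of f."""
    return any(diff for _, diff in boolean_difference(f, i, n))


# Toy concept (an assumption for illustration): (x0 AND x1) OR x2.
# Attribute x3 is present in the data but irrelevant to the concept.
def concept(x: Sequence[bool]) -> bool:
    return (x[0] and x[1]) or x[2]


if __name__ == "__main__":
    for i in range(4):
        print(f"x{i} influences the concept: {depends_on(concept, i, 4)}")
```

In this toy case the difference with respect to x3 is identically false, which is how the definition flags an attribute that never affects the classification.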
