Authors: Xingquan Zhu, Xindong Wu
DOI: 10.1007/S10462-004-0751-8
Keywords: Data quality, Classifier (UML), Data mining, Learning abilities, Artificial intelligence, Machine learning, Preprocessor, Noise measurement, Computer science
Abstract: Real-world data is never perfect and can often suffer from corruptions (noise) that may impact interpretations of the data, models created from the data, and decisions made based on the data. Noise can reduce system performance in terms of classification accuracy, the time spent building a classifier, and the size of the classifier. Accordingly, most existing learning algorithms have integrated various approaches to enhance their learning abilities in noisy environments, but the existence of noise can still introduce serious negative impacts. A more reasonable solution might be to employ some preprocessing mechanisms to handle noisy instances before a learner is formed. Unfortunately, little research has been conducted to systematically explore the impact of noise, especially from the noise handling point of view. This has made various noise processing techniques less significant, specifically when dealing with noise that is introduced in attributes. In this paper, we present a systematic evaluation of the effect of noise in machine learning. Instead of taking any unified theory of noise to evaluate the noise impacts, we differentiate noise into two categories, class noise and attribute noise, and analyze their impacts on system performance separately. Because class noise has been widely addressed in existing research efforts, we concentrate on attribute noise. We investigate the relationship between attribute noise and classification accuracy, the impact of noise at different attributes, and possible solutions for handling attribute noise. Our conclusions can be used to guide interested readers in enhancing data quality by designing various noise handling mechanisms.