Authors: Mariela Cerrada, René-Vinicio Sánchez, Fannia Pacheco, Diego Cabrera, Grover Zurita
DOI: 10.1007/S10489-015-0725-3
Keywords:
Abstract: Feature selection is an important aspect under study in machine learning based diagnosis; it aims to remove irrelevant features in order to reach good performance in the diagnostic systems. The behaviour of the models can be sensitive to the number of features, and the significant features can represent the problem better than the entire set. Consequently, algorithms that identify these features are valuable contributions. This work deals with feature selection through attribute clustering. The proposed algorithm is inspired by existing approaches, where the relative dependency between attributes is used to calculate dissimilarity values. The centroids of the created clusters are selected as representative attributes. The algorithm uses a random process for proposing centroid candidates; in this way, the exploration inherent in random search is included. A hierarchical procedure is proposed for implementing the algorithm. In each level of the hierarchy, the set of available attributes is split into disjoint sets and the selection is applied on each subset. Once the attributes are selected for each subset, a new set of available attributes is created and the selection runs again in the next level. The hierarchical implementation aims to refine the search space on a reduced set of attributes, while computational time consumption is also improved. The approach is tested with real data collected from a test bed; results show that the diagnosis precision using a Random Forest classifier is over 98 % with only 12
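
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The rough-set style definition of relative dependency (distinct values of one attribute divided by distinct value pairs), the symmetric dissimilarity, the cost used to rank random centroid candidates, and all function and parameter names (`relative_dependency`, `select_centroids`, `hierarchical_selection`, `k_per_subset`, `subset_sizes`, `trials`) are assumptions made for illustration, and attributes are assumed to be discretized.

```python
import random

def relative_dependency(col_a, col_b):
    # Assumed definition: distinct values of a divided by distinct (a, b) pairs.
    # Equals 1.0 when b is fully determined by a, and is smaller otherwise.
    return len(set(col_a)) / len(set(zip(col_a, col_b)))

def dissimilarity(col_a, col_b):
    # Assumed symmetric form: 1 minus the mean of the two directed dependencies.
    dep = relative_dependency(col_a, col_b) + relative_dependency(col_b, col_a)
    return 1.0 - 0.5 * dep

def select_centroids(columns, names, k, trials, rng):
    # Random search: draw candidate centroid sets and keep the set with the
    # lowest total dissimilarity between each attribute and its nearest centroid.
    k = min(k, len(names))
    best, best_cost = None, float("inf")
    for _ in range(trials):
        candidates = rng.sample(names, k)
        cost = sum(
            min(dissimilarity(columns[a], columns[c]) for c in candidates)
            for a in names
        )
        if cost < best_cost:
            best, best_cost = candidates, cost
    return best

def hierarchical_selection(columns, k_per_subset, subset_sizes, trials=50, seed=0):
    # columns maps attribute name -> list of discrete values (one per sample).
    # subset_sizes holds one subset size per level of the hierarchy.
    rng = random.Random(seed)
    available = sorted(columns)
    for size in subset_sizes:
        rng.shuffle(available)
        # Split the available attributes into disjoint subsets.
        subsets = [available[i:i + size] for i in range(0, len(available), size)]
        # The centroids chosen in each subset form the next level's attribute set.
        available = [
            name
            for subset in subsets
            for name in select_centroids(columns, subset, k_per_subset, trials, rng)
        ]
    return sorted(available)

# Toy data: f2 duplicates f1, and f4 is a relabeling of f3.
toy = {
    "f1": [0, 0, 1, 1, 2, 2],
    "f2": [0, 0, 1, 1, 2, 2],
    "f3": [0, 1, 0, 1, 0, 1],
    "f4": [1, 0, 1, 0, 1, 0],
}
selected = hierarchical_selection(toy, k_per_subset=1, subset_sizes=[2, 2])
print(selected)
```

In this sketch, redundant attributes (such as `f1` and `f2` in the toy data) have dissimilarity 0, so one of them can stand in for the other as a centroid. The selected attributes could then be passed to any classifier, for example a Random Forest as in the abstract.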