Authors: Khaled Fawagreh, Mohamed Medhat Gaber, Eyad Elyan
DOI: 10.1007/978-3-319-25032-8_4
Keywords:
Abstract: Random Forest (RF) is an ensemble supervised machine learning technique that was developed by Breiman over a decade ago. Compared with other ensemble techniques, it has proved its superiority. Many researchers, however, believe that there is still room for enhancing and improving its performance accuracy. This explains why, over the past decade, there have been many extensions of RF, where each extension employed a variety of techniques and strategies to improve certain aspect(s) of RF. Since it has been proven empirically that ensembles tend to yield better results when there is significant diversity among the constituent models, the objective of this paper is twofold. First, it investigates how data clustering (a well known technique) can be applied to identify groups of similar decision trees in an RF in order to eliminate redundant trees by selecting a representative from each group (cluster). Second, these likely diverse representatives are then used to produce an extension of RF termed CLUB-DRF that is much smaller in size than RF, yet performs at least as well as RF and mostly exhibits higher performance in terms of accuracy. The latter refers to a known technique called ensemble pruning. Experimental results on 15 real datasets from the UCI repository prove the superiority of our proposed extension over the traditional RF. Most of our experiments achieved a pruning level of 92 % or above while retaining or outperforming the accuracy of RF.
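The two steps described in the abstract (cluster similar trees, then keep one representative per cluster) can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify the details: the sketch assumes that each tree is summarised by its vector of predicted labels on a shared sample, that similarity is measured by Hamming distance, that clustering is done with a simple k-medoids loop, and that the most accurate member of each cluster is chosen as its representative. All function names and the toy data are hypothetical.

```python
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of samples on which two trees predict different labels."""
    return sum(x != y for x, y in zip(a, b))


def cluster_trees(preds: List[Sequence[int]], k: int, max_iter: int = 20) -> List[List[int]]:
    """Group tree indices into at most k clusters (assumed k-medoids, Hamming distance)."""
    if not 1 <= k <= len(preds):
        raise ValueError("k must be between 1 and the number of trees")
    medoids = list(range(k))  # naive initialisation: the first k trees
    clusters: List[List[int]] = []
    for _ in range(max_iter):
        # Assignment step: each tree joins the cluster of its nearest medoid.
        clusters = [[] for _ in medoids]
        for i, p in enumerate(preds):
            nearest = min(range(len(medoids)), key=lambda c: hamming(p, preds[medoids[c]]))
            clusters[nearest].append(i)
        # Drop clusters left empty (possible when two medoids predict identically).
        clusters = [members for members in clusters if members]
        # Update step: the new medoid minimises the total distance inside its cluster.
        new_medoids = [
            min(members, key=lambda m: sum(hamming(preds[m], preds[o]) for o in members))
            for members in clusters
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return clusters


def prune_forest(preds: List[Sequence[int]], accuracies: Sequence[float], k: int) -> List[int]:
    """Return one representative per cluster (assumption: its most accurate tree)."""
    clusters = cluster_trees(preds, k)
    return sorted(max(members, key=lambda m: accuracies[m]) for members in clusters)


# Toy data: predictions of 6 hypothetical trees on 4 shared samples.
preds = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
]
accuracies = [0.80, 0.70, 0.85, 0.75, 0.78, 0.72]  # made-up per-tree accuracies

kept = prune_forest(preds, accuracies, k=2)
pruning_level = 1 - len(kept) / len(preds)
print(kept, round(pruning_level, 2))
```

In this sketch the number of clusters `k` directly sets the size of the pruned ensemble, so the pruning level is `1 - k / n` for a forest of `n` trees whenever no cluster ends up empty.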