Authors: Andreas Ziegler, Inke R. König
DOI: 10.1002/WIDM.1114
Keywords:
Abstract: Random forests are fast, flexible, and represent a robust approach to mining high-dimensional data. They are an extension of classification and regression trees (CART). They perform well even in the presence of a large number of features and a small number of observations. In analogy to CART, random forests can deal with a continuous outcome, a categorical outcome, and a time-to-event outcome with censoring. The tree-building process implicitly allows for interaction between features and high correlation among features. Approaches are available for measuring variable importance and for reducing the number of features. Although random forests are used in many applications, their theoretical properties are not fully understood. Recently, several articles have provided a better understanding of random forests, and we summarize these findings. We survey different versions of random forests, including random forests for classification, probability estimation, and estimating survival, and discuss the consequences of (1) no selection, (2) random selection, and (3) a combination of deterministic and random selection for growing forests. Finally, we review the backward elimination and forward selection procedures, the determination of trees representing a forest, and the identification of important variables in a forest. We provide a brief overview of areas of application. WIREs Data Mining Knowl Discov 2014, 4:55–63. doi: 10.1002/widm.1114. For further resources related to this article, please visit the WIREs website.
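As a practical complement to the abstract, the following minimal sketch (not part of the article) illustrates the workflow it describes: growing a random forest on high-dimensional data with few observations and measuring variable importance. It assumes scikit-learn; the simulated dataset and all parameter values are arbitrary choices for demonstration only.

```python
# Illustrative sketch: random forest classification and variable importance.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Simulated high-dimensional data: many features, only a few informative.
X, y = make_classification(n_samples=300, n_features=50, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each tree is grown on a bootstrap sample; at each split only a random
# subset of features (max_features) is considered.
rf = RandomForestClassifier(n_estimators=500, max_features="sqrt",
                            oob_score=True, random_state=0)
rf.fit(X_train, y_train)
print("Out-of-bag accuracy:", rf.oob_score_)

# Permutation importance: accuracy loss when a feature's values are shuffled,
# one common way to rank variables before feature reduction.
imp = permutation_importance(rf, X_test, y_test, n_repeats=10, random_state=0)
top = imp.importances_mean.argsort()[::-1][:5]
print("Top features by permutation importance:", top)
```

Analogous estimators exist for the other outcome types mentioned in the abstract, e.g., RandomForestRegressor for a continuous outcome and survival-forest implementations in dedicated packages for censored time-to-event data.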