Authors: Arthur Zimek, Matthew Gaudet, Ricardo J.G.B. Campello, Jörg Sander
Keywords:
Abstract: Outlier detection and ensemble learning are well established research directions in data mining, yet the application of ensemble techniques to outlier detection has been rarely studied. Here, we propose and study subsampling as a technique to induce diversity among individual outlier detectors. We show analytically and experimentally that an outlier detector based on a subsample per se, besides inducing diversity, can, under certain conditions, already improve upon the results of the same outlier detector on the complete dataset. Building an ensemble on top of several subsamples further improves the results. While in the literature so far the intuition that ensembles improve over single outlier detectors has just been transferred from the classification literature, here we also justify analytically why ensembles are expected to work in the unsupervised area of outlier detection. As a side effect, running an ensemble of several outlier detectors on subsamples of the dataset is more efficient than ensembles based on other means of introducing diversity and, depending on the sample rate and the size of the ensemble, can be even more efficient than running a single outlier detector on the complete data.
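
The subsampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the base detector (distance to the k-th nearest neighbour), the combination rule (plain averaging of raw scores), and every name and default value (`subsample_ensemble_scores`, `k`, `sample_rate`, `n_members`, `seed`) are assumptions chosen for brevity. Each ensemble member draws a random subsample without replacement, and every point of the full dataset is scored against that subsample only.

```python
import math
import random


def kth_nn_distance(point, reference, k):
    """Distance from `point` to its k-th nearest neighbour among `reference`."""
    return sorted(math.dist(point, r) for r in reference)[k - 1]


def subsample_ensemble_scores(data, k=3, sample_rate=0.2, n_members=25, seed=0):
    """Average k-NN-distance outlier scores over several random subsamples.

    All parameter names and defaults are illustrative assumptions.
    """
    n = len(data)
    if n <= k:
        raise ValueError("need more than k data points")
    rng = random.Random(seed)
    # Each member sees only a fraction of the data; keep at least k + 1 points
    # so a k-th neighbour still exists after a point excludes itself.
    m = min(n, max(k + 1, int(sample_rate * n)))
    totals = [0.0] * n
    for _ in range(n_members):
        sample_idx = rng.sample(range(n), m)  # drawn without replacement
        for i, point in enumerate(data):
            # Score every point of the full dataset against the subsample only.
            reference = [data[j] for j in sample_idx if j != i]
            totals[i] += kth_nn_distance(point, reference, k)
    return [total / n_members for total in totals]


# Toy data: 100 points from one Gaussian cluster plus one planted outlier.
rng = random.Random(42)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)] + [(12.0, 12.0)]
scores = subsample_ensemble_scores(data)
top = max(range(len(data)), key=scores.__getitem__)
print(top, data[top])
```

On this toy data the planted point at (12.0, 12.0) should receive the highest averaged score. Each member compares a point against roughly `sample_rate * n` reference points instead of all `n`, which is the source of the efficiency gain the abstract mentions; whether the ensemble as a whole is cheaper than a single detector on the complete data depends on `sample_rate` and `n_members`.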