Fast Outlier Detection in High Dimensional Spaces

关键词: Data set 、 Point (geometry) 、 Solution set 、 Time complexity 、 k-nearest neighbors algorithm 、 Combinatorics 、 Mathematics 、 Curse of dimensionality 、 Hilbert curve 、 Outlier

摘要: In this paper we propose a new definition of distance-based outlier that considers for each point the sum of the distances from its k nearest neighbors, called weight. Outliers are those points having the largest values of weight. In order to compute these weights, we find the k nearest neighbors of each point in a fast and efficient way by linearizing the search space through the Hilbert space filling curve. The algorithm consists of two phases, the first provides an approximated solution, within a small factor, after executing at most d+ 1 scans …

参考文章(20)

Carla E. Brodley, Mark A. Friedl, Identifying and eliminating mislabeled training instances national conference on artificial intelligence. pp. 799- 805 ,(1996)

Zbigniew R. Struzik, Arno P.J.M. Siebes, Outlier detection and localisation with wavelet based multifractal formalism ,(2000)

Raymond T. Ng, Edwin M. Knorr, Algorithms for Mining Distance-Based Outliers in Large Datasets very large data bases. pp. 392- 403 ,(1998)

Prabhakar Raghavan, Andreas Arning, Rakesh Agrawal, A linear method for deviation detection in large databases knowledge discovery and data mining. pp. 164- 169 ,(1996)

Sunita Sarawagi, Rakesh Agrawal, Nimrod Megiddo, Discovery-Driven Exploration of OLAP Data Cubes extending database technology. pp. 168- 182 ,(1998) , 10.1007/BFB0100984

Swanwa Liao, Mario Alberto López, Finding k-Closest-Pairs Efficiently for High Dimensional Data. canadian conference on computational geometry. ,(2000)

Kenji Yamanishi, Jun-ichi Takeuchi, Discovering outlier filtering rules from unlabeled data: combining a supervised learner with an unsupervised learner knowledge discovery and data mining. pp. 389- 394 ,(2001) , 10.1145/502512.502570

Vic Barnett, Toby Lewis, Outliers in Statistical Data ,(1978)

C. Faloutsos, S. Roseman, Fractals for secondary key retrieval symposium on principles of database systems. pp. 247- 252 ,(1989) , 10.1145/73721.73746

10.

Kenji Yamanishi, Jun-Ichi Takeuchi, Graham Williams, Peter Milne, On-line unsupervised outlier detection using finite mixtures with discounting learning algorithms knowledge discovery and data mining. pp. 320- 324 ,(2000) , 10.1145/347090.347160

Fast Outlier Detection in High Dimensional Spaces

来源期刊

我的账户

Fast Outlier Detection in High Dimensional Spaces

来源期刊

相似文章 10

我的账户