Detecting outlying subspaces for high-dimensional data: a heuristic search approach

Author: Ji Zhang

DOI:

Keywords:

Abstract: In this paper, we identify a new task for studying the outlying degree of high-dimensional data, i.e. finding the subspaces (subsets of features) in which given points are outliers, and propose a novel detection algorithm called High-D Outlying subspace Detection (HighDOD). We measure the outlying degree of a point using the sum of distances between the point and its k nearest neighbors. Heuristic pruning strategies are proposed to realize fast search, and an efficient dynamic search method with a sample-based learning process has been implemented. Experimental results show that HighDOD outperforms other searching alternatives such as the naive top-down, bottom-up and random search methods. Points in these sparse subspaces are assumed to be the outliers. While knowing which data points are the outliers can be useful, in many applications it is more important to identify the subspaces in which a given point is an outlier, which motivates the proposal of a new technique in this paper to handle this task.
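The outlying-degree measure described in the abstract can be illustrated with a minimal Python sketch. It is not the HighDOD algorithm itself: the function names, the Euclidean distance, the threshold parameter and the toy data are all assumptions made for illustration, and the search shown is only the exhaustive baseline, without the pruning strategies or the sample-based dynamic search the abstract mentions.

```python
import heapq
import math
from itertools import combinations


def outlying_degree(data, p, subspace, k):
    """Sum of distances from data[p] to its k nearest neighbors,
    measured only on the feature indices listed in `subspace`.
    Euclidean distance is an assumption of this sketch."""
    target = data[p]
    dists = (
        math.sqrt(sum((row[j] - target[j]) ** 2 for j in subspace))
        for i, row in enumerate(data)
        if i != p
    )
    return sum(heapq.nsmallest(k, dists))


def outlying_subspaces(data, p, k, threshold):
    """Exhaustive baseline (hypothetical helper): test every non-empty
    feature subset and keep those where the outlying degree of data[p]
    reaches `threshold`. The cost grows as 2^d in the dimensionality d."""
    d = len(data[0])
    return [
        s
        for size in range(1, d + 1)
        for s in combinations(range(d), size)
        if outlying_degree(data, p, s, k) >= threshold
    ]


# Toy data: the last point is unusual only in feature 1.
data = [
    (1.0, 1.0),
    (1.1, 0.9),
    (0.9, 1.2),
    (1.0, 9.0),
]
print(outlying_subspaces(data, p=3, k=2, threshold=5.0))  # [(1,), (0, 1)]
```

In this toy example the point looks ordinary in feature 0 alone but is flagged in every subspace that contains feature 1. Because the exhaustive loop visits all 2^d - 1 subsets, it only works for small d, which is the kind of cost a heuristic search is meant to avoid.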


References (10)

Fabrizio Angiulli, Clara Pizzuti. Fast Outlier Detection in High Dimensional Spaces. European Conference on Principles of Data Mining and Knowledge Discovery, pp. 15-26 (2002). DOI: 10.1007/3-540-45681-3_2

Raymond T. Ng, Edwin M. Knorr. Algorithms for Mining Distance-Based Outliers in Large Datasets. Very Large Data Bases, pp. 392-403 (1998).

Arthur E. Mace. Sample-Size Determination (1964).

Raymond T. Ng, Edwin M. Knorr. Finding Intensional Knowledge of Distance-Based Outliers. Very Large Data Bases, pp. 211-222 (1999).

Wen Jin, Anthony K. H. Tung, Jiawei Han. Mining top-n local outliers in large databases. Knowledge Discovery and Data Mining, pp. 293-298 (2001). DOI: 10.1145/502512.502554

S. Papadimitriou, H. Kitagawa, P.B. Gibbons, C. Faloutsos. LOCI: fast outlier detection using the local correlation integral. International Conference on Data Engineering, pp. 315-326 (2003). DOI: 10.1109/ICDE.2003.1260802

Sridhar Ramaswamy, Rajeev Rastogi, Kyuseok Shim. Efficient algorithms for mining outliers from large data sets. International Conference on Management of Data, vol. 29, pp. 427-438 (2000). DOI: 10.1145/335191.335437

Micheline Kamber, Jiawei Han, Jian Pei. Data Mining: Concepts and Techniques (2000).

Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng, Jörg Sander. LOF: identifying density-based local outliers. International Conference on Management of Data, vol. 29, pp. 93-104 (2000). DOI: 10.1145/335191.335388

Stefan Berchtold, Daniel A. Keim, Hans-Peter Kriegel. The X-tree: an index structure for high-dimensional data. Very Large Data Bases, pp. 451-462 (2001). DOI: 10.1016/B978-155860651-7/50124-8
