Discriminative features for identifying and interpreting outliers

Authors: Xuan Hong Dang, Ira Assent, Raymond T. Ng, Arthur Zimek, Erich Schubert

DOI: 10.1109/ICDE.2014.6816642

Abstract: We consider the problem of outlier detection and interpretation. While most existing studies focus on the first problem, we simultaneously address the equally important challenge of outlier interpretation. We propose an algorithm that uncovers outliers in subspaces of reduced dimensionality in which they are well discriminated from regular objects, while at the same time retaining the natural local structure of the original data to ensure the quality of outlier explanation. Our algorithm takes a mathematically appealing approach from spectral graph embedding theory, and we show that it achieves the globally optimal solution for the objective of subspace learning. By using a number of real-world datasets, we demonstrate its performance not only w.r.t. the outlier detection rate but also w.r.t. the discriminative human-interpretable features. This is the first approach to exploit discriminative features for both outlier detection and interpretation, leading to a better understanding of how and why the hidden outliers are exceptional.

References (40)
Arthur Zimek, Hans-Peter Kriegel, Erich Schubert, Peer Kröger. Interpreting and Unifying Outlier Scores. SIAM International Conference on Data Mining, pp. 13-24 (2011).
Hans-Peter Kriegel, Peer Kröger, Erich Schubert, Arthur Zimek. Outlier Detection in Axis-Parallel Subspaces of High Dimensional Data. Advances in Knowledge Discovery and Data Mining, pp. 831-838 (2009). DOI: 10.1007/978-3-642-01307-2_86
Robert Tibshirani, Trevor Hastie, Jerome H. Friedman. The Elements of Statistical Learning (2001).
Leman Akoglu, Mary McGlohon, Christos Faloutsos. OddBall: Spotting Anomalies in Weighted Graphs. Knowledge Discovery and Data Mining, vol. 6119, pp. 410-421 (2010). DOI: 10.1007/978-3-642-13672-6_40
Zengyou He, Shengchun Deng, Xiaofei Xu. A Unified Subspace Outlier Ensemble Framework for Outlier Detection. Advances in Web-Age Information Management, pp. 632-637 (2005). DOI: 10.1007/11563952_56
Raymond T. Ng, Edwin M. Knorr. Algorithms for Mining Distance-Based Outliers in Large Datasets. Very Large Data Bases, pp. 392-403 (1998).
James Franklin. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. The Mathematical Intelligencer, vol. 27, pp. 83-85 (2005). DOI: 10.1007/BF02985802
Hujun Bao, Jiawei Han, Deng Cai, Xiaofei He, Kun Zhou. Locality Sensitive Discriminant Analysis. International Joint Conference on Artificial Intelligence, pp. 708-713 (2007).
A. M. Martinez, R. Benavente. The AR Face Database. CVC Technical Report 24 (1998).
Gene H. Golub, Charles F. Van Loan. Matrix Computations (3rd ed.). Johns Hopkins University Press (1996).