Detection of Outliers in an Unsupervised Environment

Authors: M. Ashwini Kumari, M. S. Bhargavi, Sahana D. Gowda

DOI: 10.1007/978-81-322-2208-8_51

Abstract: Outliers are exceptions when compared with the rest of the data. They do not have a clear distinction with respect to the regular samples in a dataset. Analysis and knowledge extraction from data containing outliers lead to ambiguity and confused conclusions. Therefore, there is a need for outlier detection as a pre-processing stage of data mining. From a multidimensional perspective, outlier detection is a challenging issue, because an object may deviate in one subspace and appear perfectly regular in another subspace. In this paper, an ensemble meta-algorithm is proposed to analyze the samples and vote on them for outlier identification in subspaces. Cook's distance, a regression-based model, is applied to detect the outliers voted for by the meta-algorithm. Extensive experimentation on real datasets demonstrates the efficiency of the proposed system in detecting outliers.
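To make the regression-based step concrete, the following is a minimal sketch of Cook's distance for an ordinary least-squares fit, written with NumPy. It is not the authors' implementation: the function name `cooks_distance`, the synthetic data, and the 4/n cutoff (a common rule of thumb) are all assumptions made for illustration, and the ensemble voting over subspaces described in the abstract is not reproduced here.

```python
import numpy as np


def cooks_distance(X, y):
    """Return Cook's distance of every observation for an OLS fit of y on X.

    Hypothetical helper for illustration; an intercept column is added.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(len(y)), X])  # design matrix with intercept
    n, p = A.shape
    H = A @ np.linalg.pinv(A)                  # hat matrix
    h = np.diag(H)                             # leverage of each observation
    resid = y - H @ y                          # OLS residuals
    s2 = resid @ resid / (n - p)               # residual variance estimate
    # D_i = e_i^2 / (p * s^2) * h_ii / (1 - h_ii)^2
    return (resid ** 2 / (p * s2)) * h / (1.0 - h) ** 2


# Synthetic example (assumed data): a noisy line with one planted outlier.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 30)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)
y[-1] += 15.0                                  # planted outlier at the last index

d = cooks_distance(x, y)
flagged = np.flatnonzero(d > 4.0 / len(y))     # rule-of-thumb cutoff, an assumption
print("most influential index:", int(np.argmax(d)))
print("flagged indices:", flagged.tolist())
```

In a subspace setting, one plausible use is to compute such a score only for the samples that received enough votes, regressing one attribute of the subspace on the others; how the paper actually combines the two stages is not specified in the abstract.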
