Authors: Lei Duan, Guanting Tang, Jian Pei, James Bailey, Akiko Campbell
DOI: 10.1007/S10618-014-0398-2
Keywords: Rank (computer programming), Artificial intelligence, Subspace topology, Anomaly detection, Pattern recognition, Data mining, Data set, Object (computer science), Mathematics, Outlier, Synthetic data, Set (abstract data type)
Abstract: When we are investigating an object in a data set, which itself may or may not be an outlier, can we identify unusual (i.e., outlying) aspects of the object? In this paper, we identify the novel problem of mining outlying aspects on numeric data. Given a query object $o$ in a multidimensional numeric data set $O$, in which subspace is $o$ most outlying? Technically, we use the rank of the probability density of an object in a subspace to measure the outlyingness of the object in that subspace. A minimal subspace where the query object is ranked the best is an outlying aspect. Computing the outlying aspects of a query object is far from trivial. A naive method has to calculate the probability densities of all objects and rank them in every subspace, which is very costly when the dimensionality is high. We systematically develop a heuristic method that is capable of searching data sets with tens of dimensions efficiently. Our empirical study using both real data and synthetic data demonstrates that our method is effective and efficient.
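The naive method mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the single fixed-bandwidth Gaussian kernel, and the reading of "minimal" as "no proper subset of the subspace achieves the same rank" are all assumptions made for illustration. The sketch enumerates every subspace, estimates the density of every object in it, ranks the query object by density (rank 1 = lowest density), and keeps the minimal subspaces with the best rank.

```python
import math
from itertools import combinations


def density(data, idx, dims, bandwidth=1.0):
    """Unnormalized Gaussian kernel density of data[idx], restricted to dims.

    The normalizing constant is identical for all objects within one
    subspace, so omitting it does not change the ranking in that subspace.
    """
    x = data[idx]
    total = 0.0
    for y in data:
        sq_dist = sum((x[d] - y[d]) ** 2 for d in dims)
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / len(data)


def rank_in_subspace(data, query_idx, dims, bandwidth=1.0):
    """Rank of the query by density in dims; rank 1 is the lowest density."""
    dens = [density(data, i, dims, bandwidth) for i in range(len(data))]
    q = dens[query_idx]
    return 1 + sum(1 for v in dens if v < q)


def naive_outlying_aspects(data, query_idx, max_dim=None, bandwidth=1.0):
    """Exhaustively search all subspaces (hypothetical helper, exponential cost).

    Returns (best_rank, subspaces), where subspaces are the minimal
    dimension tuples in which the query attains best_rank.
    """
    n_dims = len(data[0])
    max_dim = max_dim or n_dims
    best_rank, best = None, []
    # Enumerate by increasing size so subsets are always seen before supersets.
    for size in range(1, max_dim + 1):
        for dims in combinations(range(n_dims), size):
            r = rank_in_subspace(data, query_idx, dims, bandwidth)
            if best_rank is None or r < best_rank:
                best_rank, best = r, [dims]
            elif r == best_rank and not any(set(b) <= set(dims) for b in best):
                best.append(dims)
    return best_rank, best


# Toy data: the last object is central in dimension 0 but far away in dimension 1.
toy = [(0.0, 0.0), (0.1, 0.1), (0.2, -0.1), (-0.1, 0.05), (0.05, 5.0)]
print(naive_outlying_aspects(toy, query_idx=4))
```

With d dimensions the loop visits 2^d - 1 subspaces and computes n densities in each, every density costing a pass over all n objects. This is the cost that the abstract says motivates a heuristic search.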