Authors: Shyam Boriah, Varun Chandola, Vipin Kumar
DOI: 10.1137/1.9781611972788.22
Keywords:
Abstract: Measuring similarity or distance between two entities is a key step for several data mining and knowledge discovery tasks. The notion of similarity for continuous data is relatively well-understood, but for categorical data, the similarity computation is not straightforward. Several data-driven similarity measures have been proposed in the literature to compute the similarity between two categorical data instances, but their relative performance has not been evaluated. In this paper we study the performance of a variety of similarity measures in the context of a specific data mining task: outlier detection. Results on a variety of data sets show that while no one measure dominates others for all types of problems, some measures are able to have consistently high performance.