Authors: Jiawei Han, Zhijun Yin, Sangkyum Kim, Liangliang Cao, Xin Jin
DOI:
Keywords: Activity detection, Large scale data, Cluster analysis, Exploit, Set (abstract data type), Computer science, Scale (descriptive set theory), Data mining, General activity
Abstract: In this paper, we propose GAD (General Activity Detection) for fast clustering on large scale data. Within this framework, we design a set of algorithms for different scenarios: (1) an exact GAD algorithm, E-GAD, which is much faster than K-Means and gets the same clustering result; (2) approximate GAD algorithms with different assumptions, which are faster than E-GAD while achieving different degrees of approximation; (3) GAD-based algorithms to handle the "large clusters" problem, which appears in many clustering applications. Two existing activity detection algorithms, GT and CGAUTC, are special cases under the framework. The most important contribution of our work is that the framework is a general solution to exploit activity detection for fast clustering in both exact and approximate scenarios, and the algorithms proposed within the framework can achieve very high speed. Extensive experiments have been conducted on several large datasets from various real world applications; the results show that the proposed algorithms are effective and efficient.