Authors: Zhenhui Li, Jae-Gil Lee, Xiaolei Li, Jiawei Han
DOI: 10.1007/978-3-642-12098-5_3
Keywords: Clustering high-dimensional data, Data stream clustering, Correlation clustering, Computer science, CURE data clustering algorithm, Fuzzy clustering, Constrained clustering, Cluster analysis, Data mining, Canopy clustering algorithm
Abstract: Trajectory clustering has played a crucial role in data analysis since it reveals underlying trends of moving objects. Due to their sequential nature, trajectory data are often received incrementally, e.g., new points continuously reported by a GPS system. However, since existing trajectory clustering algorithms were developed for static datasets, they are not suitable for incremental clustering with the following two requirements. First, clustering should be processed efficiently since it can be frequently requested. Second, huge amounts of trajectory data must be accommodated, as they will accumulate constantly. An incremental clustering framework for trajectories is proposed in this paper. It contains two parts: online micro-cluster maintenance and offline macro-cluster creation. For the online part, when a new bunch of trajectories arrives, each trajectory is simplified into a set of directed line segments in order to find clusters of trajectory subparts. Micro-clusters are used to store compact summaries of similar trajectory line segments, which take much smaller space than raw trajectories. When new data are added, micro-clusters are updated incrementally to reflect the changes. For the offline part, when a user requests to see the current clustering result, macro-clustering is performed on the set of micro-clusters rather than on all trajectories over the whole time span. Since the number of micro-clusters is smaller than that of the original trajectories, macro-clusters are generated efficiently to show the clustering result of the trajectories. Experimental results on both synthetic and real data sets show that our framework achieves high efficiency as well as high clustering quality.
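The online micro-cluster maintenance described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify which features a micro-cluster stores or how similarity is measured, so the names `MicroCluster`, `distance`, `insert_segment`, and `threshold`, the choice of summarizing segments by running sums of midpoints and direction vectors, and the simple additive distance are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

# A directed line segment: (start point, end point). Hypothetical representation.
Segment = Tuple[Tuple[float, float], Tuple[float, float]]


@dataclass
class MicroCluster:
    """Compact summary of similar segments: a count plus running sums (assumed features)."""
    n: int = 0
    sum_mx: float = 0.0  # sum of segment midpoint x
    sum_my: float = 0.0  # sum of segment midpoint y
    sum_dx: float = 0.0  # sum of direction vector x (end - start)
    sum_dy: float = 0.0  # sum of direction vector y (end - start)

    def add(self, seg: Segment) -> None:
        """Absorb one segment by updating the sums; the raw segment is not stored."""
        (x1, y1), (x2, y2) = seg
        self.n += 1
        self.sum_mx += (x1 + x2) / 2
        self.sum_my += (y1 + y2) / 2
        self.sum_dx += x2 - x1
        self.sum_dy += y2 - y1

    def center(self) -> Tuple[float, float]:
        return (self.sum_mx / self.n, self.sum_my / self.n)

    def direction(self) -> Tuple[float, float]:
        return (self.sum_dx / self.n, self.sum_dy / self.n)


def distance(mc: MicroCluster, seg: Segment) -> float:
    """Assumed distance: midpoint gap plus direction-vector gap."""
    (x1, y1), (x2, y2) = seg
    cx, cy = mc.center()
    dx, dy = mc.direction()
    mid_gap = math.hypot((x1 + x2) / 2 - cx, (y1 + y2) / 2 - cy)
    dir_gap = math.hypot((x2 - x1) - dx, (y2 - y1) - dy)
    return mid_gap + dir_gap


def insert_segment(clusters: List[MicroCluster], seg: Segment, threshold: float) -> None:
    """Add seg to the nearest micro-cluster if close enough, else start a new one."""
    if clusters:
        nearest = min(clusters, key=lambda mc: distance(mc, seg))
        if distance(nearest, seg) <= threshold:
            nearest.add(seg)
            return
    fresh = MicroCluster()
    fresh.add(seg)
    clusters.append(fresh)


if __name__ == "__main__":
    clusters: List[MicroCluster] = []
    incoming = [((0.0, 0.0), (1.0, 0.0)),
                ((0.0, 0.1), (1.0, 0.1)),
                ((10.0, 10.0), (10.0, 11.0))]
    for seg in incoming:
        insert_segment(clusters, seg, threshold=1.0)
    print(len(clusters), [mc.n for mc in clusters])
```

Because each micro-cluster keeps only a count and a few sums, memory grows with the number of micro-clusters rather than with the number of segments received, which is the property the abstract relies on when macro-clustering runs over micro-clusters instead of raw trajectories.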