Authors: Ninh D. Pham, Quang Loc Le, Tran Khanh Dang
DOI: 10.1007/978-3-642-12145-6_12
Keywords: Data mining, Cluster analysis, Computer science, Data cleansing, Series (mathematics), Pruning (decision trees), Time series database, Gaussian, Anomaly detection, Representation (mathematics)
Abstract: Finding discords in time series databases has been an important problem over the last decade due to its variety of real-world applications, including data cleansing, fault diagnostics, and financial analysis. The best known approach, to our knowledge, is the HOT SAX technique, which is based on the equiprobable distribution of SAX representations of time series. This characteristic, however, is not preserved in the reduced-dimensionality literature, especially for datasets that lack a Gaussian distribution. In this paper, we introduce a k-means based algorithm for symbolic representations of time series called adaptive Symbolic Aggregate approXimation (aSAX) and propose the HOT aSAX algorithm for time series discord discovery. Due to the clustered characteristic of aSAX words, our algorithm produces greater pruning power than the previous approach. Our empirical experiments with real-world time series datasets confirm the theoretical analyses as well as the efficiency of our approach.
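
The idea of adapting SAX breakpoints with k-means can be illustrated with a minimal sketch. The abstract does not give the exact update rule, so everything below is an assumption: the breakpoints start from the standard Gaussian equiprobable cuts used by classic SAX and are then refined by one-dimensional Lloyd iterations over the PAA values. The function names `paa`, `gaussian_breakpoints`, `adapt_breakpoints`, and `to_word` are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right
from statistics import NormalDist, mean


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: mean of each of n_segments chunks.

    Assumes len(series) >= n_segments so that no chunk is empty.
    """
    n = len(series)
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    return [mean(series[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]


def gaussian_breakpoints(alphabet_size):
    """Classic SAX breakpoints: equiprobable cuts of the standard normal."""
    nd = NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def adapt_breakpoints(values, alphabet_size, max_iter=100, tol=1e-9):
    """Refine breakpoints with 1-D Lloyd (k-means) iterations.

    Assumption, not the paper's stated procedure: each interval is
    represented by the mean of the values falling in it, and each new
    breakpoint is the midpoint between adjacent representatives.
    """
    nd = NormalDist()
    breaks = gaussian_breakpoints(alphabet_size)
    # Initial representatives: the median of each equiprobable interval.
    centers = [nd.inv_cdf((i + 0.5) / alphabet_size)
               for i in range(alphabet_size)]
    for _ in range(max_iter):
        buckets = [[] for _ in range(alphabet_size)]
        for v in values:
            buckets[bisect_right(breaks, v)].append(v)
        # An empty interval keeps its previous representative.
        centers = [mean(b) if b else c for b, c in zip(buckets, centers)]
        new_breaks = [(a + b) / 2 for a, b in zip(centers, centers[1:])]
        converged = all(abs(x - y) <= tol
                        for x, y in zip(breaks, new_breaks))
        breaks = new_breaks
        if converged:
            break
    return breaks


def to_word(values, breaks):
    """Map each value to a letter according to the interval it falls in."""
    return "".join(chr(ord("a") + bisect_right(breaks, v)) for v in values)
```

With these helpers, a series would be reduced with `paa`, the breakpoints fitted once on a training set of PAA values with `adapt_breakpoints`, and each subsequence then converted with `to_word`. Fitting the breakpoints to the data rather than to a Gaussian assumption is what lets the resulting words cluster around the observed values.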