Authors: Josh R, Eman M.
DOI: 10.14569/IJACSA.2011.021115
Keywords: Artificial intelligence, Algorithm, CURE data clustering algorithm, Knowledge extraction, Machine learning, Data mining, Computer science, Data stream mining, Clustering high-dimensional data, Cluster analysis, Data stream clustering, Data stream
Abstract: The clustering or partitioning of a dataset's records into groups of similar records is an important aspect of knowledge discovery from datasets. A considerable amount of research has been applied to the identification of clusters in very large, multi-dimensional, static datasets. However, the traditional clustering and/or pattern recognition algorithms that have resulted from this research are inefficient for clustering data streams. A data stream is a dynamic dataset characterized by a sequence of data records that evolves over time, has extremely fast arrival rates, and is unbounded. Today, the world abounds with processes that generate high-speed, evolving data streams. Examples include click streams, credit card transactions, and sensor networks. The data stream's inherent characteristics present an interesting set of time- and space-related challenges for clustering algorithms. In particular, processing time is severely constrained, and clustering must be performed in a single pass over the incoming data. This paper presents both a clustering framework and an algorithm that, combined, address these challenges and allow end-users to explore and gain knowledge from evolving data streams. Our approach includes the integration of open source products that are used to control the data stream and to facilitate the harnessing of knowledge from the data stream. Experimental results of testing the framework with various data streams are also discussed.