Authors: Michael J. Laszlo, Jeevan D'Souza
DOI:
Keywords: Data mining, Mathematics, Centroid, Crossover, Euclidean distance, Medoid, Genetic algorithm, Cluster analysis, Population, Data point
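Abstract: Data clustering, which partitions data points into clusters, has many useful applications in economics, science and engineering. Data clustering algorithms can be partitional or hierarchical. The k-means algorithm is the most widely used partitional clustering algorithm because of its simplicity and efficiency. One problem with the k-means algorithm is that the quality of the partitions it produces is highly dependent on the initial selection of centers. This problem has been tackled using genetic algorithms (GA), where a set of centers is encoded into an individual of a population and solutions are generated using evolutionary operators such as crossover, mutation and selection. Of the GA methods, the region-based genetic algorithm (RBGA) has proven to be an effective technique when the centroid was used as the representative object of a cluster (ROC) and the Euclidean distance was used as the distance metric. The RBGA uses a region-based crossover operator that exchanges subsets of centers that belong to a region of space rather than exchanging random centers. The rationale is that subsets of centers that occupy a given region of space tend to serve as building blocks. Exchanging such centers preserves and propagates high-quality partial solutions. This research aims at assessing the RBGA with a variety of ROCs and distance metrics. The RBGA was tested along with other GA methods on four benchmark datasets with different distance metrics, varied numbers of centers, and both centroids and medoids as ROCs. The results obtained showed the superior performance of the RBGA across all datasets and sets of parameters, indicating that region-based crossover may prove an effective strategy across a broad range of clustering problems.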
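
The idea of exchanging centers by region rather than at random can be illustrated with a minimal Python sketch. It is not the authors' operator: the abstract does not say how regions are chosen, so the sketch assumes the simplest possible region, an axis-aligned half-space. The names `region_crossover`, `dim` and `threshold` are hypothetical.

```python
from typing import List, Sequence, Tuple

Center = Tuple[float, ...]


def region_crossover(
    parent_a: Sequence[Center],
    parent_b: Sequence[Center],
    dim: int,
    threshold: float,
) -> Tuple[List[Center], List[Center]]:
    """Exchange the centers that fall inside one region of space.

    Assumption (not from the source): the region is the half-space
    where coordinate `dim` is below `threshold`.
    """
    def in_region(center: Center) -> bool:
        return center[dim] < threshold

    # Child 1: parent A's centers inside the region, parent B's outside it.
    child_1 = [c for c in parent_a if in_region(c)] + \
              [c for c in parent_b if not in_region(c)]
    # Child 2: the complementary combination.
    child_2 = [c for c in parent_b if in_region(c)] + \
              [c for c in parent_a if not in_region(c)]
    return child_1, child_2


if __name__ == "__main__":
    a = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
    b = [(0.5, 0.2), (3.0, 3.5), (5.0, 5.0)]
    print(region_crossover(a, b, dim=0, threshold=2.0))
```

Each child inherits a spatially coherent group of centers from one parent, which is the "building block" intuition in the abstract. In this simplified form a child may end up with more or fewer centers than its parents, so a practical version would need some repair step to restore the required number of centers; how the RBGA handles this is not stated here.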