The analysis of a simple k-means clustering algorithm

作者： Tapas Kanungo , David M. Mount , Nathan S. Netanyahu , Christine Piatko , Ruth Silverman

关键词:

摘要: Abstract : K-means clustering is a very popular technique which used in numerous applications. Given set of n data points R(exp d) and an integer k, the problem to determine k d), called centers, so as minimize mean squared distance from each point its nearest center. A heuristic for k-means Lloyd's algorithm. In this paper, we present simple efficient implementation algorithm, call filtering This algorithm easy implement. It differs most other approaches that it precomputes kd-tree structure rather than center points. We establish practical efficiency two ways. First, data-sensitive analysis algorithm's running time. Second, have implemented performed number empirical studies, both on synthetically generated real applications color quantization, compression, segmentation.

参考文章(35)

E. W. Forgy, Cluster analysis of multivariate data : efficiency versus interpretability of classifications Biometrics. ,vol. 21, pp. 768- 769 ,(1965)

Stavros G. Kolliopoulos, Satish Rao, A Nearly Linear-Time Approximation Scheme for the Euclidean kappa-median Problem european symposium on algorithms. pp. 378- 389 ,(1999) , 10.1007/3-540-48481-7_33

M. Inaba, Application of weighted Voronoi diagrams and randomization to variance-based k-clustering symposium on computational geometry. ,(1994)

Allen Gersho, Robert M. Gray, Vector Quantization and Signal Compression ,(1991)

Songrit Maneewongvatana, David M. Mount, Analysis of Approximate Nearest Neighbor Searching with Clustered Point Sets Data Structures, Near Neighbor Searches, and Methodology. pp. 105- 123 ,(1999)

Keinosuke Fukunaga, Introduction to statistical pattern recognition (2nd ed.) Academic Press Professional, Inc.. ,(1990)

Richard C. Dubes, Anil K. Jain, Algorithms for clustering data ,(1988)

Christine Piatko, Tapas Kanungo, Ruth Silverman, Nathan S. Netanyahu, David M. Mount, Angela Y. Wu, Computing nearest neighbors for moving points and applications to clustering symposium on discrete algorithms. pp. 931- 932 ,(1999)

Sanjeev Arora, Prabhakar Raghavan, Satish Rao, Approximation schemes for Euclidean k-medians and related problems symposium on the theory of computing. pp. 106- 113 ,(1998) , 10.1145/276698.276718

10.

Pravin M. Vaidya, AnO(n logn) algorithm for the all-nearest-neighbors Problem Discrete & Computational Geometry. ,vol. 4, pp. 101- 115 ,(1989) , 10.1007/BF02187718

The analysis of a simple k-means clustering algorithm

来源期刊

我的账户

The analysis of a simple k-means clustering algorithm

来源期刊

相似文章 10

我的账户