Authors: Robert Tibshirani, Guenther Walther, Trevor Hastie
Keywords:
Abstract: We propose a method (the ‘gap statistic’) for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. K-means or hierarchical), comparing the change in within-cluster dispersion with that expected under an appropriate reference null distribution. Some theory is developed for the proposal, and a simulation study shows that the gap statistic usually outperforms other methods that have been proposed in the literature.
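
The comparison described in the abstract can be illustrated with a minimal Python sketch. The abstract does not fix the details, so several choices here are assumptions made for illustration only: K-means (plain Lloyd iterations) as the clustering algorithm, within-cluster sum of squared distances to the cluster mean as the dispersion W_k, a uniform reference distribution over the bounding box of the data, and the rule of picking the smallest k with Gap(k) >= Gap(k+1) - s_(k+1). The function names (`sq_dist`, `within_dispersion`, `kmeans`, `log_wk`, `gap_statistic`) and parameters (`n_refs`, `restarts`, `seed`) are hypothetical, not taken from the paper.

```python
import math
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def within_dispersion(points, labels):
    """W_k: sum over clusters of squared distances to the cluster mean."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for members in clusters.values():
        centre = [sum(c) / len(members) for c in zip(*members)]
        total += sum(sq_dist(p, centre) for p in members)
    return total


def kmeans(points, k, rng, iters=30):
    """Plain Lloyd's algorithm (assumed clusterer); returns one label per point."""
    centres = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centres[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centres[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


def log_wk(points, k, rng, restarts=3):
    """log(W_k), taking the best of a few random restarts."""
    best = min(within_dispersion(points, kmeans(points, k, rng))
               for _ in range(restarts))
    return math.log(best)


def gap_statistic(points, k_max, n_refs=10, seed=0):
    """Return (estimated k, gaps, errs) for k = 1..k_max."""
    rng = random.Random(seed)
    lows = [min(c) for c in zip(*points)]
    highs = [max(c) for c in zip(*points)]
    # Assumed null reference: uniform draws over the data's bounding box.
    refs = [[tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
             for _ in points] for _ in range(n_refs)]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        ref_logs = [log_wk(ref, k, rng) for ref in refs]
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean_ref) ** 2 for v in ref_logs) / n_refs)
        # Gap(k): expected log dispersion under the null minus the observed one.
        gaps.append(mean_ref - log_wk(points, k, rng))
        errs.append(sd * math.sqrt(1 + 1 / n_refs))
    # Smallest k with Gap(k) >= Gap(k+1) - s_(k+1); fall back to k_max.
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k, gaps, errs
    return k_max, gaps, errs
```

Calling `gap_statistic(data, k_max=5)` on a list of coordinate tuples returns the estimated number of clusters together with the gap values and their simulation error terms. Because K-means starts from random centres, the estimate can vary with `seed`, `restarts`, and `n_refs`.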