A Clustering Method Based on the Maximum Entropy Principle

Authors: Edwin Aldana-Bobadilla, Angel Kuri-Morales

DOI: 10.3390/E17010151

Abstract: Clustering is an unsupervised process to determine which unlabeled objects in a set share interesting properties. The objects are grouped into k subsets (clusters) whose elements optimize a proximity measure. Methods based on information theory have proven to be feasible alternatives. They are based on the assumption that a cluster is one subset with the minimal possible degree of "disorder". They attempt to minimize the entropy of each cluster. We propose a clustering method based on the maximum entropy principle. Such a method explores the space of all possible probability distributions of the data to find one that maximizes the entropy subject to extra conditions based on prior information about the clusters. The prior information is based on the assumption that the elements of a cluster are "similar" to each other in accordance with some statistical measure. As a consequence of such a principle, those distributions of high entropy that satisfy the conditions are favored over others. Searching the space to find the optimal distribution of objects in the clusters represents a hard combinatorial problem, which disallows the use of traditional optimization techniques. Genetic algorithms are a good alternative to solve this problem. We benchmark our method relative to the best theoretical performance, which is given by the Bayes classifier when data are normally distributed, and a multilayer perceptron network, which offers the best practical performance when data are not normal. In general, a supervised classification method will outperform a non-supervised one, since, in the first case, the elements of the classes are known a priori. In what follows, we show that our method's effectiveness is comparable to a supervised one. This clearly exhibits the superiority of our method.
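The trade-off the abstract describes, favoring high-entropy distributions while requiring cluster members to be "similar", can be illustrated with a small sketch. This is not the authors' formulation: the entropy here is taken over cluster membership proportions as a stand-in, the similarity condition is approximated by mean squared distance to the centroid, and the names `assignment_entropy`, `within_cluster_spread`, `fitness`, and the `penalty` weight are all hypothetical. A genetic algorithm, as the abstract suggests, could use a score of this kind to rank candidate label assignments.

```python
import math
from collections import Counter


def assignment_entropy(labels):
    """Shannon entropy (bits) of the cluster-membership proportions.

    Assumption: a simple stand-in for the entropy the paper maximizes.
    """
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def within_cluster_spread(points, labels):
    """Mean squared Euclidean distance of each point to its cluster centroid.

    Assumption: one possible 'statistical measure' of similarity.
    """
    groups = {}
    for p, k in zip(points, labels):
        groups.setdefault(k, []).append(p)
    total = 0.0
    for members in groups.values():
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum(
            sum((m[d] - centroid[d]) ** 2 for d in range(dim)) for m in members
        )
    return total / len(points)


def fitness(points, labels, penalty=1.0):
    """Hypothetical objective: reward entropy, penalize dissimilar clusters."""
    return assignment_entropy(labels) - penalty * within_cluster_spread(points, labels)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
    # Grouping nearby points should score higher than mixing distant ones.
    print(fitness(pts, [0, 0, 1, 1]), fitness(pts, [0, 1, 0, 1]))
```

Both assignments in the usage example have the same membership entropy, so the ranking is decided entirely by the similarity term; that is the role the "extra conditions" play in the abstract's description.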
