A Clustering Method Based on the Maximum Entropy Principle

Authors: Edwin Aldana-Bobadilla, Angel Kuri-Morales

Abstract: Clustering is an unsupervised process to determine which unlabeled objects in a set share interesting properties. The objects are grouped into k subsets (clusters) whose elements optimize a proximity measure. Methods based on information theory have proven to be feasible alternatives. They are based on the assumption that a cluster is one subset with the minimal possible degree of "disorder". They attempt to minimize the entropy of each cluster. We propose a clustering method based on the maximum entropy principle. Such a method explores the space of all possible probability distributions of the data to find one that maximizes the entropy subject to extra conditions based on prior information about the clusters. The prior information is based on the assumption that the elements of a cluster are "similar" to each other in accordance with some statistical measure. As a consequence of such a principle, those distributions of high entropy that satisfy the conditions are favored over others. Searching the space to find the optimal distribution of objects in the clusters represents a hard combinatorial problem, which disallows the use of traditional optimization techniques. Genetic algorithms are a good alternative to solve this problem. We benchmark our method relative to the best theoretical performance, which is given by the Bayes classifier when data are normally distributed, and a multilayer perceptron network, which offers the best practical performance when data are not normal. In general, a supervised classification method will outperform a non-supervised one, since, in the first case, the elements of the classes are known a priori. In what follows, we show that our method's effectiveness is comparable to a supervised one. This clearly exhibits the superiority of our method.
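The idea in the abstract of favoring high-entropy distributions of objects over clusters, subject to a similarity condition, can be illustrated with a small sketch. This is not the authors' objective function or their genetic algorithm: the names `assignment_entropy`, `mean_within_cluster_spread`, `fitness`, and the `penalty` weight are hypothetical, and the squared distance to the cluster mean stands in for the unspecified "statistical measure" of similarity. A search procedure such as a genetic algorithm would compare candidate assignments using a score of this general kind.

```python
import math
from collections import Counter

def assignment_entropy(labels):
    """Shannon entropy (bits) of the empirical distribution of objects over clusters."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def mean_within_cluster_spread(points, labels):
    """Average squared distance of each 1-D point to its cluster mean.

    Assumption: this is only a stand-in for the similarity condition.
    """
    counts = Counter(labels)
    sums = Counter()
    for x, lab in zip(points, labels):
        sums[lab] += x
    means = {lab: sums[lab] / counts[lab] for lab in counts}
    total = sum((x - means[lab]) ** 2 for x, lab in zip(points, labels))
    return total / len(points)

def fitness(points, labels, penalty=1.0):
    """Hypothetical score: reward high entropy, penalize dissimilar cluster members."""
    return assignment_entropy(labels) - penalty * mean_within_cluster_spread(points, labels)

# Two candidate assignments of six 1-D points to k = 2 clusters.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
compact = [0, 0, 0, 1, 1, 1]  # similar points share a cluster
mixed = [0, 1, 0, 1, 0, 1]    # same cluster sizes, dissimilar members

print(assignment_entropy(compact), assignment_entropy(mixed))
print(fitness(points, compact) > fitness(points, mixed))
```

Both candidates have the same entropy because their cluster sizes are equal, so in this toy score only the similarity term separates them. That is the role the abstract assigns to the extra conditions.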
