Information and Entropy in Cluster Analysis

Author: H. H. Bock

DOI: 10.1007/978-94-011-0800-3_4

Abstract: Cluster analysis provides methods for subdividing a set of objects into a suitable number of ‘classes’, ‘groups’, or ‘types’ C_1, …, C_m such that each class is as homogeneous as possible and different classes are sufficiently separated. This paper shows how entropy and information measures have been or can be used in this framework. We present several probabilistic clustering approaches which are related to, or lead to, information and entropy criteria g(C) for selecting an optimum partition C = (C_1, …, C_m) of n data vectors, for qualitative and quantitative data, assuming loglinear, logistic, and normal distribution models, together with appropriate iterative clustering algorithms. A new partitioning problem is considered in Section 5, where we look for a dissection (discretization) C of an arbitrary sample space Y (e.g., R^p or {0,1}^p) such that the φ-divergence I_C(P_0, P_1) between two discretized distributions P_0(C_i), P_1(C_i) (i = 1, …, m) will be maximized (e.g., Kullback-Leibler’s discrimination information or the χ² noncentrality parameter). We conclude with some comments on methods for selecting a suitable number of classes, e.g., by using Akaike’s information criterion AIC and its modifications.
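
To make the φ-divergence between two discretized distributions concrete, the following is a minimal Python sketch and not the paper's own algorithm. It assumes the common convention I_C(P_0, P_1) = Σ_i P_0(C_i) φ(P_1(C_i) / P_0(C_i)) with all class probabilities positive. The function names and the example probabilities are hypothetical. Choosing φ(λ) = −log λ gives the Kullback-Leibler discrimination information, and φ(λ) = (λ − 1)² gives a χ²-type quantity proportional to a noncentrality parameter.

```python
import math

def phi_divergence(p0, p1, phi):
    """Sum over classes of p0_i * phi(p1_i / p0_i).

    Assumes every p0_i > 0 (and p1_i > 0 when phi involves a logarithm).
    """
    return sum(a * phi(b / a) for a, b in zip(p0, p1))

def kl_phi(lam):
    # phi(lambda) = -log(lambda) yields sum_i p0_i * log(p0_i / p1_i)
    return -math.log(lam)

def chi2_phi(lam):
    # phi(lambda) = (lambda - 1)^2 yields sum_i (p1_i - p0_i)^2 / p0_i
    return (lam - 1.0) ** 2

# Hypothetical class probabilities P0(C_i), P1(C_i) for m = 3 classes.
p0 = [0.5, 0.3, 0.2]
p1 = [0.2, 0.3, 0.5]

kl = phi_divergence(p0, p1, kl_phi)
chi2 = phi_divergence(p0, p1, chi2_phi)

# Merging the first two classes gives a coarser partition with m = 2 classes.
p0_coarse = [p0[0] + p0[1], p0[2]]
p1_coarse = [p1[0] + p1[1], p1[2]]
kl_coarse = phi_divergence(p0_coarse, p1_coarse, kl_phi)

print(kl, chi2, kl_coarse)
```

For convex φ, merging classes can never increase the divergence, which is why the choice of the partition C matters when the goal is to keep the two discretized distributions as distinguishable as possible.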
