Information Theoretic Hierarchical Clustering

Authors: Mehdi Aghagolzadeh, Hamid Soltanian-Zadeh, Babak Nadjar Araabi

DOI: 10.3390/E13020450

Abstract: Hierarchical clustering has been extensively used in practice, where clusters can be assigned and analyzed simultaneously, especially when estimating the number of clusters is challenging. However, due to the conventional proximity measures recruited in these algorithms, they are only capable of detecting mass-shape clusters and encounter problems in identifying complex data structures. Here, we introduce two bottom-up hierarchical approaches that exploit an information theoretic proximity measure to explore the nonlinear boundaries between clusters and extract data structures further than the second order statistics. Experimental results on both artificial and real datasets demonstrate the superiority of the proposed algorithm compared to the algorithms reported in the literature, especially in detecting the true number of clusters.
