Interpretability and Refinement of Clustering

Authors: Felix Iglesias Vazquez, Tanja Zseby, Arthur Zimek

DOI: 10.1109/DSAA49011.2020.00014

Abstract: The difficulty of validating clustering reliability hinders the adoption of clustering in real-life applications. We propose: (a) a set of symbolic representations to interpret problem spaces and (b) the CluReAL algorithm to refine any clustering result regardless of the technique used. Both approaches are grounded in recently published absolute cluster validity indices. The experiments conducted show how refinement improves clustering performance in a wide variety of scenarios and builds more interpretable solutions, whereas the symbolic representations are shown to offer explainable summaries of clustering contexts. Refinement and interpretability are both crucial to reduce failure and to increase performance control and operational awareness in processes that depend on clustering.
