Graph clustering based on optimization of a macroscopic structure of clusters

作者: Yuta Taniguchi , Daisuke Ikeda

DOI: 10.1007/978-3-642-24477-3_27

关键词:

摘要: A graph is a flexible data structure for various data, such as the Web, SNSs and molecular architectures. Not only expressed naturally by graph, it also used which does not have explicit structures extracting implicit relationships hidden in e.g. co-occurrence of words text similarity pixels an image. By extraction, we can make full use many sophisticated methods graphs to solve wide range problems. In analysis graphs, clustering problem one most important problems, divide all vertices given into some groups called clusters. Existing algorithms typically assume that number intra-cluster edges large while inter-cluster absolutely small. Therefore these fail do case noisy extraction tends yield ones because subject definition relation among vertices. Instead assumption, introduce macroscopic (MS), clusters roughly describes graph. This paper presents algorithm which, clusters, tries find set distance between MS induced from calculated ideal minimized. other words, solves optimization problem. For m-clustering problem, defined m-vertex each vertex has self-loop. To confirm performance improvements exhaustively, conducted experiments with artificial different amounts noise. The results show our method handle very correctly existing completely failed clustering. Furthermore, even less noise, treats them well if difference edge densities those are sufficiently big. We did on transformed vector more practical case. From found algorithm, indeed, works much better than ones.

参考文章(21)
Tamás Nepusz, Gábor Csárdi, The igraph software package for complex network research InterJournal Complex Systems. ,vol. 1695, ,(2006)
Feifeng Zheng, Stanley P. Y. Fung, Wun-Tat Chan, Francis Y. L. Chin, Chung Keung Poon, Prudence W. H. Wong, Improved on-line broadcast scheduling with deadlines computing and combinatorics conference. pp. 320- 329 ,(2006) , 10.1007/11809678_34
Joydeep Ghosh, Raymond Mooney, Alexander Strehl, Impact of Similarity Measures on Web-page Clustering ,(2000)
Dominic Widdows, Beate Dorow, Elisha Moses, Jean-Pierre Eckmann, Danilo Sergi, Katarina Ling, Using Curvature and Markov Clustering in Graphs for Lexical Acquisition and Word Sense Discrimination arXiv: Other Condensed Matter. ,(2004)
Ralitsa Angelova, Gerhard Weikum, Graph-based text classification Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06. pp. 485- 492 ,(2006) , 10.1145/1148170.1148254
M. Girvan, M. E. J. Newman, Community structure in social and biological networks Proceedings of the National Academy of Sciences of the United States of America. ,vol. 99, pp. 7821- 7826 ,(2002) , 10.1073/PNAS.122653799
Jean-François Rual, Kavitha Venkatesan, Tong Hao, Tomoko Hirozane-Kishikawa, Amélie Dricot, Ning Li, Gabriel F. Berriz, Francis D. Gibbons, Matija Dreze, Nono Ayivi-Guedehoussou, Niels Klitgord, Christophe Simon, Mike Boxem, Stuart Milstein, Jennifer Rosenberg, Debra S. Goldberg, Lan V. Zhang, Sharyl L. Wong, Giovanni Franklin, Siming Li, Joanna S. Albala, Janghoo Lim, Carlene Fraughton, Estelle Llamosas, Sebiha Cevik, Camille Bex, Philippe Lamesch, Robert S. Sikorski, Jean Vandenhaute, Huda Y. Zoghbi, Alex Smolyar, Stephanie Bosak, Reynaldo Sequerra, Lynn Doucette-Stamm, Michael E. Cusick, David E. Hill, Frederick P. Roth, Marc Vidal, Towards a proteome-scale map of the human protein–protein interaction network Nature. ,vol. 437, pp. 1173- 1178 ,(2005) , 10.1038/NATURE04209
Günther J.L. Gerhardt, Ney Lemke, Gilberto Corso, Network clustering coefficient approach to DNA sequence analysis Chaos, Solitons & Fractals. ,vol. 28, pp. 1037- 1045 ,(2006) , 10.1016/J.CHAOS.2005.08.138
R. A. FISHER, THE USE OF MULTIPLE MEASUREMENTS IN TAXONOMIC PROBLEMS Annals of Human Genetics. ,vol. 7, pp. 179- 188 ,(1936) , 10.1111/J.1469-1809.1936.TB02137.X
Shantanu Dutt, Wenyong Deng, Cluster-aware iterative improvement techniques for partitioning large VLSI circuits ACM Transactions on Design Automation of Electronic Systems. ,vol. 7, pp. 91- 121 ,(2002) , 10.1145/504914.504918