Detecting Overlapping Communities from Local Spectral Subspaces

作者: David Bindel , John E Hopcroft , Yixuan Li , Kun He , Yiwei Sun

DOI:

关键词: Random seedLinear subspaceMathematicsInvariant subspaceIndicator vectorEmbeddingData miningConnected componentArtificial intelligencePattern recognitionVertex (graph theory)Power iteration

摘要: Based on the definition of local spectral subspace, we propose a novel approach called LOSP for overlapping community detection. Instead using invariant subspace spanned by dominant eigenvectors entire network, run power method few steps to approximate leading that depict embedding neighborhood structure around seeds interest. We then seek sparse indicator vector in these vectors such are its support. We evaluate five large real world networks across various domains with labeled ground-truth communities and compare results state-of-the-art detection approaches. identifies members target high accuracy from very seed members, outperforms Heat Kernel or PageRank diffusions as well global baselines. Two candidate definitions analyzed, different scoring functions determining boundary, including two new metrics, thoroughly evaluated. The structural properties sets impact set size discussed. observe low degree behave better, is robust even when started single random seed. Using subroutine starting each ego connected component, try harder yet significant task identifying all vertex in. Experiments show proposed achieves F1 measures detected multiple containing vertex.

参考文章(28)
Symeon Papadopoulos, Yiannis Kompatsiaris, Athena Vakali, Ploutarchos Spyridonos, Community detection in Social Media Data Mining and Knowledge Discovery. ,vol. 24, pp. 515- 554 ,(2012) , 10.1007/S10618-011-0224-Z
Santo Fortunato, Claudio Castellano, Community Structure in Graphs arXiv: Physics and Society. ,(2007)
U. Kang, Christos Faloutsos, Beyond 'Caveman Communities': Hubs and Spokes for Graph Compression and Mining international conference on data mining. pp. 300- 309 ,(2011) , 10.1109/ICDM.2011.26
Andrea Lancichinetti, Filippo Radicchi, José J. Ramasco, Santo Fortunato, Finding Statistically Significant Communities in Networks PLoS ONE. ,vol. 6, pp. e18961- ,(2011) , 10.1371/JOURNAL.PONE.0018961
Isabel M. Kloumann, Jon M. Kleinberg, Community membership identification from small seed sets knowledge discovery and data mining. pp. 1366- 1375 ,(2014) , 10.1145/2623330.2623621
Kyle Kloster, David F. Gleich, Heat kernel based community detection knowledge discovery and data mining. pp. 1386- 1395 ,(2014) , 10.1145/2623330.2623706
Jaewon Yang, Jure Leskovec, Defining and evaluating network communities based on ground-truth Knowledge and Information Systems. ,vol. 42, pp. 181- 213 ,(2015) , 10.1007/S10115-013-0693-Z
Lei Pan, Chao Dai, Chongjun Wang, Junyuan Xie, Meilin Liu, Overlapping Community Detection via Leader-Based Local Expansion in Social Networks international conference on tools with artificial intelligence. ,vol. 1, pp. 397- 404 ,(2012) , 10.1109/ICTAI.2012.61
Daniel A. Spielman, Shang-Hua Teng, Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems symposium on the theory of computing. pp. 81- 90 ,(2004) , 10.1145/1007352.1007372
Joyce Jiyoung Whang, David F. Gleich, Inderjit S. Dhillon, Overlapping community detection using seed set expansion conference on information and knowledge management. pp. 2099- 2108 ,(2013) , 10.1145/2505515.2505535