Community membership identification from small seed sets

作者: Isabel M. Kloumann , Jon M. Kleinberg

DOI: 10.1145/2623330.2623621

关键词:

摘要: In many applications we have a social network of people and would like to identify the members an interesting but unlabeled group or community. We start with small number exemplar -- they may be followers political ideology fans music genre need use those examples discover additional members. This problem gives rise seed expansion in community detection: given example members, how can graph used predict identities remaining, hidden members? contrast global detection (graph partitioning covering), is best suited for identifying communities locally concentrated around nodes interest. A growing body work has as scalable means detecting overlapping communities. Yet despite interest expansion, there are divergent approaches literature still isn't systematic understanding which different domains. Here evaluate several variants uncover subtle trade-offs between approaches. explore properties set improve performance, focusing on heuristics that one control practice. As consequence this found opportunities performance gains. also consider adaptive version requests made membership labels particular nodes, such finds field studies leads connections contrasts active learning exploration exploitation. Finally, topological sets correlate algorithm explain these empirical observations theoretical ones. our methods across multiple domains, using publicly available datasets labeled, ground-truth

参考文章(21)
Charles H. Hubbell, An Input-Output Approach to Clique Identification Sociometry. ,vol. 28, pp. 377- ,(1965) , 10.2307/2785990
Rajeev Motwani, Terry Winograd, Lawrence Page, Sergey Brin, The PageRank Citation Ranking : Bringing Order to the Web the web conference. ,vol. 98, pp. 161- 172 ,(1999)
Jon M. Kleinberg, Authoritative sources in a hyperlinked environment symposium on discrete algorithms. pp. 668- 677 ,(1998) , 10.5555/314613.315045
James P Bagrow, Evaluating local community methods in networks Journal of Statistical Mechanics: Theory and Experiment. ,vol. 2008, pp. 05001- ,(2008) , 10.1088/1742-5468/2008/05/P05001
Andrew Mehler, Steven Skiena, Expanding network communities from representative examples ACM Transactions on Knowledge Discovery from Data. ,vol. 3, pp. 1- 27 ,(2009) , 10.1145/1514888.1514890
Leo Katz, A new status index derived from sociometric analysis Psychometrika. ,vol. 18, pp. 39- 43 ,(1953) , 10.1007/BF02289026
Daniel A. Spielman, Shang-Hua Teng, Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems symposium on the theory of computing. pp. 81- 90 ,(2004) , 10.1145/1007352.1007372
Bruno Abrahao, Sucheta Soundarajan, John Hopcroft, Robert Kleinberg, On the separability of structural classes of communities Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '12. pp. 624- 632 ,(2012) , 10.1145/2339530.2339631
Joyce Jiyoung Whang, David F. Gleich, Inderjit S. Dhillon, Overlapping community detection using seed set expansion conference on information and knowledge management. pp. 2099- 2108 ,(2013) , 10.1145/2505515.2505535
Jaewon Yang, Jure Leskovec, Defining and evaluating network communities based on ground-truth Proceedings of the ACM SIGKDD Workshop on Mining Data Semantics - MDS '12. pp. 3- ,(2012) , 10.1145/2350190.2350193