Exact Recovery in the Stochastic Block Model

作者: Afonso S. Bandeira , Emmanuel Abbe , Georgina Hall

DOI:

关键词:

摘要: The stochastic block model (SBM) with two communities, or equivalently the planted bisection model, is a popular of random graph exhibiting cluster behaviour. In symmetric case, has equally sized clusters and vertices connect probability $p$ within $q$ across clusters. past decades, large body literature in statistics computer science focused on providing lower-bounds scaling $|p-q|$ to ensure exact recovery. this paper, we identify sharp threshold phenomenon for recovery: if $\alpha=pn/\log(n)$ $\beta=qn/\log(n)$ are constant (with $\alpha>\beta$), recovering communities high possible $\frac{\alpha+\beta}{2} - \sqrt{\alpha \beta}>1$ impossible \beta}<1$. particular, improves existing bounds. This also sets new line sight efficient clustering algorithms. While maximum likelihood (ML) achieves optimal (by definition), it worst-case NP-hard. paper proposes an algorithm based semidefinite programming relaxation ML, which proved succeed close threshold, while numerical experiments suggest may achieve threshold. An succeeds all way down obtained using partial recovery combined local improvement procedure.

参考文章(23)
Anne Condon, Richard M. Karp, Algorithms for Graph Partitioning on the Planted Partition Model Randomization, Approximation, and Combinatorial Optimization. Algorithms and Techniques. pp. 221- 232 ,(1999) , 10.1007/978-3-540-48413-4_23
Van Vu, A simple SVD algorithm for finding hidden partitions arXiv: Combinatorics. ,(2014)
F. McSherry, Spectral partitioning of random graphs international conference on cluster computing. pp. 529- 537 ,(2001) , 10.1109/SFCS.2001.959929
Elchanan Mossel, Joe Neeman, Allan Sly, Stochastic Block Models and Reconstruction arXiv: Probability. ,(2012)
Elchanan Mossel, Joe Neeman, Allan Sly, A Proof Of The Block Model Threshold Conjecture arXiv: Probability. ,(2013)
Emmanuel Abbe, Andrea Montanari, Conditional Random Fields, Planted Constraint Satisfaction, and Entropy Concentration arXiv: Probability. ,(2013)
Michel X. Goemans, David P. Williamson, Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming Journal of the ACM. ,vol. 42, pp. 1115- 1145 ,(1995) , 10.1145/227683.227684
Ravi B. Boppana, Eigenvalues and graph bisection: An average-case analysis 28th Annual Symposium on Foundations of Computer Science (sfcs 1987). pp. 280- 285 ,(1987) , 10.1109/SFCS.1987.22
Aurelien Decelle, Florent Krzakala, Cristopher Moore, Lenka Zdeborová, Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Physical Review E. ,vol. 84, pp. 066106- ,(2011) , 10.1103/PHYSREVE.84.066106
M.E Dyer, A.M Frieze, The solution of some random NP-hard problems in polynomial expected time Journal of Algorithms. ,vol. 10, pp. 451- 489 ,(1989) , 10.1016/0196-6774(89)90001-1