作者: Afonso S. Bandeira , Emmanuel Abbe , Georgina Hall
DOI:
关键词:
摘要: The stochastic block model (SBM) with two communities, or equivalently the planted bisection model, is a popular of random graph exhibiting cluster behaviour. In symmetric case, has equally sized clusters and vertices connect probability $p$ within $q$ across clusters. past decades, large body literature in statistics computer science focused on providing lower-bounds scaling $|p-q|$ to ensure exact recovery. this paper, we identify sharp threshold phenomenon for recovery: if $\alpha=pn/\log(n)$ $\beta=qn/\log(n)$ are constant (with $\alpha>\beta$), recovering communities high possible $\frac{\alpha+\beta}{2} - \sqrt{\alpha \beta}>1$ impossible \beta}<1$. particular, improves existing bounds. This also sets new line sight efficient clustering algorithms. While maximum likelihood (ML) achieves optimal (by definition), it worst-case NP-hard. paper proposes an algorithm based semidefinite programming relaxation ML, which proved succeed close threshold, while numerical experiments suggest may achieve threshold. An succeeds all way down obtained using partial recovery combined local improvement procedure.