作者: Alexandre Proutiere , Se-Young Yun
DOI:
关键词: Combinatorics 、 Cluster (physics) 、 Stochastic block model 、 Open problem 、 Cluster analysis 、 Mathematics 、 Simple (abstract algebra) 、 Matching (graph theory) 、 Set (abstract data type) 、 Finite set 、 Statistics
摘要: We consider the problem of community detection or clustering in labeled Stochastic Block Model (LSBM) with a finite number $K$ clusters sizes linearly growing global population items $n$. Every pair is independently at random, and label $\ell$ appears probability $p(i,j,\ell)$ between two indexed by $i$ $j$, respectively. The objective to reconstruct from observation these random labels. Clustering under SBM their extensions has attracted much attention recently. Most existing work aimed characterizing set parameters such that it possible infer either positively correlated true clusters, vanishing proportion misclassified items, exactly matching clusters. find there exists algorithm most $s$ average general LSBM for any $s=o(n)$, which solves one open raised \cite{abbe2015community}. further develop an algorithm, based on simple spectral methods, achieves this fundamental performance limit within $O(n \mbox{polylog}(n))$ computations without a-priori knowledge model parameters.