Authors: Weizhong Zhao, V. Martha, Xiaowei Xu
DOI: 10.1109/AINA.2013.47
Keywords:
Abstract: Big data such as complex networks with millions of vertices and edges is infeasible to process using conventional computation. MapReduce is a programming model that empowers us to analyze big data in a cluster of computers. In this paper, we propose a Parallel Structural Clustering Algorithm for Networks (PSCAN) for the detection of clusters or community structures in big networks such as Twitter. PSCAN is based on the structural clustering algorithm SCAN, which not only finds clusters accurately, but also identifies vertices playing special roles such as hubs and outliers. An empirical evaluation of PSCAN using both real and synthetic networks demonstrated an outstanding performance in terms of accuracy and running time. We analyzed a Twitter network with 40 million users and 1.4 billion follower/following relationships by using PSCAN on a Hadoop cluster of 15 computers. The result shows that PSCAN successfully detected interesting communities of people who share common interests.
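
The abstract does not spell out how SCAN scores the connection between two vertices, so the following is only a minimal sketch for orientation. It assumes the structural similarity from the original SCAN formulation, where the closed neighborhood of a vertex (the vertex plus its neighbors) is compared with that of an adjacent vertex: the size of the intersection divided by the geometric mean of the two sizes. The function names (`map_edge`, `reduce_neighbors`, `structural_similarity`) and the two-step map/reduce split are hypothetical illustrations in plain Python, not the job layout PSCAN actually uses on Hadoop.

```python
from collections import defaultdict
from math import sqrt


def map_edge(u, v):
    """Map step (illustrative): emit an undirected edge under both endpoints."""
    yield u, v
    yield v, u


def reduce_neighbors(pairs):
    """Reduce step (illustrative): group pairs into closed neighborhoods.

    The closed neighborhood of a vertex contains the vertex itself
    together with all of its neighbors.
    """
    neighborhoods = defaultdict(set)
    for vertex, neighbor in pairs:
        neighborhoods[vertex].add(vertex)
        neighborhoods[vertex].add(neighbor)
    return dict(neighborhoods)


def structural_similarity(neighborhoods, u, v):
    """Shared closed-neighborhood size over the geometric mean of both sizes."""
    nu, nv = neighborhoods[u], neighborhoods[v]
    return len(nu & nv) / sqrt(len(nu) * len(nv))


# Toy graph (made up for illustration): a triangle 1-2-3 with a pendant vertex 4.
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
pairs = [pair for u, v in edges for pair in map_edge(u, v)]
neighborhoods = reduce_neighbors(pairs)

sim_12 = structural_similarity(neighborhoods, 1, 2)  # both inside the triangle
sim_34 = structural_similarity(neighborhoods, 3, 4)  # triangle vertex to pendant vertex
```

In this toy graph the edge inside the triangle scores higher than the edge leading to the pendant vertex. SCAN thresholds such scores to decide which vertices belong to a cluster, and vertices left outside every cluster are the candidates for the hub and outlier roles the abstract mentions.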