Massive Social Network Analysis: Mining Twitter for Social Good

Authors: David Ediger, Karl Jiang, Jason Riedy, David A. Bader, Courtney Corley

DOI: 10.1109/ICPP.2010.66

Abstract: Social networks produce an enormous quantity of data. Facebook consists of over 400 million active users sharing over 5 billion pieces of information each month. Analyzing this vast quantity of unstructured data presents challenges for software and hardware. We present GraphCT, a Graph Characterization Toolkit for massive graphs representing social network data. On a 128-processor Cray XMT, GraphCT estimates the betweenness centrality of an artificially generated (R-MAT) 537 million vertex, 8.6 billion edge graph in 55 minutes and a real-world graph (Kwak, et al.) with 61.6 million vertices and 1.47 billion edges in 105 minutes. We use GraphCT to analyze public data from Twitter, a microblogging network. Twitter's message connections appear primarily tree-structured as a news dissemination system. Within the public data, however, are clusters of conversations. Using GraphCT, we can rank actors within these conversations and help analysts focus attention on a much smaller data subset.
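
The abstract says GraphCT estimates betweenness centrality rather than computing it exactly. One common way to do this is to run Brandes' shortest-path dependency accumulation from a random sample of source vertices and scale the result. The Python sketch below illustrates that idea on a small unweighted graph. It is not GraphCT code, and the function name `estimate_betweenness` and its parameters are hypothetical, chosen only for this example.

```python
import random
from collections import deque


def estimate_betweenness(adj, num_sources, seed=0):
    """Estimate betweenness from a sample of BFS sources (illustrative only).

    adj maps each vertex to a list of neighbours (unweighted graph).
    Scores count ordered (source, target) pairs, so on an undirected
    graph each unordered pair contributes twice.
    """
    rng = random.Random(seed)
    vertices = list(adj)
    sources = rng.sample(vertices, min(num_sources, len(vertices)))
    score = {v: 0.0 for v in vertices}
    if not sources:
        return score

    for s in sources:
        sigma = {v: 0 for v in vertices}    # number of shortest s-v paths
        dist = {v: -1 for v in vertices}    # BFS distance from s
        preds = {v: [] for v in vertices}   # predecessors on shortest paths
        sigma[s], dist[s] = 1, 0
        order = []                          # vertices in BFS visit order
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:             # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Accumulate dependencies from the farthest vertices back to s.
        delta = {v: 0.0 for v in vertices}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]

    # Extrapolate from the sampled sources to all vertices.
    scale = len(vertices) / len(sources)
    return {v: score[v] * scale for v in vertices}


# Path graph a - b - c: only b lies between two other vertices.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(estimate_betweenness(path, num_sources=3))
```

When `num_sources` equals the number of vertices, every vertex is a source and the result is exact; smaller samples trade accuracy for running time, which is the trade-off that matters on graphs with billions of edges.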

References (32)
Deepayan Chakrabarti, Christos Faloutsos, Yiping Zhan. R-MAT: A Recursive Model for Graph Mining. SIAM International Conference on Data Mining, pp. 442-446 (2004).
David A. Bader, Shiva Kintali, Kamesh Madduri, Milena Mihail. Approximating betweenness centrality. Workshop on Algorithms and Models for the Web Graph, pp. 124-137 (2007). DOI: 10.1007/978-3-540-77004-6_10
Jeremy G. Siek, Lie-Quan Lee, Andrew Lumsdaine. The Boost Graph Library: User Guide and Reference Manual. Addison-Wesley (2002).
Jure Leskovec, Ajit Singh, Jon Kleinberg. Patterns of influence in a recommendation network. Knowledge Discovery and Data Mining, pp. 380-389 (2006). DOI: 10.1007/11731139_44
Linton C. Freeman. A Set of Measures of Centrality Based on Betweenness. Sociometry, vol. 40, pp. 35-41 (1977). DOI: 10.2307/3033543
Douglas Gregor, Andrew Lumsdaine. Lifting sequential graph algorithms for distributed-memory parallel computation. Proceedings of the 20th Annual ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA '05), vol. 40, pp. 423-437 (2005). DOI: 10.1145/1094811.1094844
Michalis Faloutsos, Petros Faloutsos, Christos Faloutsos. On power-law relationships of the Internet topology. ACM Special Interest Group on Data Communication, vol. 29, pp. 251-262 (1999). DOI: 10.1145/316188.316229
R. W. Hamming. Error detecting and error correcting codes. Bell System Technical Journal, vol. 29, pp. 147-160 (1950). DOI: 10.1002/J.1538-7305.1950.TB00463.X
David Ediger, Karl Jiang, Jason Riedy, David A. Bader. Massive streaming data analytics: A case study with clustering coefficients. IEEE International Symposium on Parallel and Distributed Processing, Workshops and PhD Forum, pp. 1-8 (2010). DOI: 10.1109/IPDPSW.2010.5470687
Fredrik Liljeros, Yvonne Åberg, Luís A. Nunes Amaral, Christofer R. Edling, H. Eugene Stanley. The web of human sexual contacts. Nature, vol. 411, pp. 907-908 (2001). DOI: 10.1038/35082140