Bring Order into the Samples: A Novel Scalable Method for Influence Maximization

作者: Xiaoyang Wang , Ying Zhang , Wenjie Zhang , Xuemin Lin , Chen Chen

DOI: 10.1109/TKDE.2016.2624734

关键词:

摘要: As a key problem in viral marketing, influence maximization has been extensively studied the literature. Given positive integer $k$ , social network $\mathcal {G}$ and certain propagation model, it aims to find set of nodes that have largest spread. The state-of-the-art method IMM is based on reverse sampling (RIS) framework. By using martingale technique, greatly outperforms previous methods efficiency. However, still limitations scalability due high overhead deciding tight sample size. In this paper, instead spending effort size, we present novel bottom- k sketch RIS framework, namely BKRIS, which brings order samples into applying can derive early termination conditions significantly accelerate seed selection procedure. Moreover, provide cost-effective proper size bound quality returned result. addition, several optimization techniques reduce cost generating samples’ efficiently deal with worst-case scenario. We demonstrate efficiency effectiveness proposed over 10 real world datasets. Compared approach, BKRIS achieve up two orders magnitude speedup almost same dataset 1.8 billion edges, return 50 seeds 1.3 seconds 5,000 36.6 seconds. It takes 55.32 second 3,664.97 seconds, respectively.

参考文章(29)
G. L. Nemhauser, L. A. Wolsey, M. L. Fisher, An analysis of approximations for maximizing submodular set functions--I Mathematical Programming. ,vol. 14, pp. 265- 294 ,(1978) , 10.1007/BF01588971
C. A. R. Hoare, Algorithm 65: find Communications of the ACM. ,vol. 4, pp. 321- 322 ,(1961) , 10.1145/366622.366647
Cigdem Aslay, Wei Lu, Francesco Bonchi, Amit Goyal, Laks V. S. Lakshmanan, Viral marketing meets social advertising Proceedings of the VLDB Endowment. ,vol. 8, pp. 814- 825 ,(2015) , 10.14778/2752939.2752950
Edith Cohen, Haim Kaplan, Summarizing data using bottom-k sketches principles of distributed computing. pp. 225- 234 ,(2007) , 10.1145/1281100.1281133
Wei Chen, Chi Wang, Yajun Wang, Scalable influence maximization for prevalent viral marketing in large-scale social networks knowledge discovery and data mining. pp. 1029- 1038 ,(2010) , 10.1145/1835804.1835934
Amit Goyal, Wei Lu, Laks VS Lakshmanan, None, SIMPATH: An Efficient Algorithm for Influence Maximization under the Linear Threshold Model international conference on data mining. pp. 211- 220 ,(2011) , 10.1109/ICDM.2011.132
Kevin Beyer, Peter J. Haas, Berthold Reinwald, Yannis Sismanis, Rainer Gemulla, On synopses for distinct-value estimation under multiset operations Proceedings of the 2007 ACM SIGMOD international conference on Management of data - SIGMOD '07. pp. 199- 210 ,(2007) , 10.1145/1247480.1247504
Youze Tang, Yanchen Shi, Xiaokui Xiao, Influence Maximization in Near-Linear Time: A Martingale Approach international conference on management of data. pp. 1539- 1554 ,(2015) , 10.1145/2723372.2723734
Pedro Domingos, Matt Richardson, Mining the network value of customers knowledge discovery and data mining. pp. 57- 66 ,(2001) , 10.1145/502512.502525
Matthew Richardson, Pedro Domingos, Mining knowledge-sharing sites for viral marketing Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '02. pp. 61- 70 ,(2002) , 10.1145/775047.775057