ASAP: fast, approximate graph pattern mining at scale

作者: Vladimir Braverman , Anand Padmanabha Iyer , Ion Stoica , Zaoxing Liu , Shivaram Venkataraman

DOI: 10.5555/3291168.3291224

关键词:

摘要: While there has been a tremendous interest in processing data that an underlying graph structure, existing distributed systems take several minutes or even hours to mine simple patterns on graphs. This paper presents ASAP, fast, approximate computation engine for pattern mining. ASAP leverages state-of-the-art results approximation theory, and extends it general settings. To enable the users navigate tradeoff between result accuracy latency, we propose novel approach build Error-Latency Profile (ELP) given computation. We have implemented general-purpose dataflow platform evaluated extensively patterns. Our experimental show outperforms exact mining solutions by up 77×. Further, can scale graphs with billions of edges without need large clusters.

参考文章(37)
Luciana S. Buriol, Gereon Frahling, Stefano Leonardi, Alberto Marchetti-Spaccamela, Christian Sohler, Counting triangles in data streams symposium on principles of database systems. pp. 253- 262 ,(2006) , 10.1145/1142351.1142388
Charalampos E. Tsourakakis, U. Kang, Gary L. Miller, Christos Faloutsos, DOULION Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '09. pp. 837- 846 ,(2009) , 10.1145/1557019.1557111
Xifeng Yan, Jiawei Han, gSpan: graph-based substructure pattern mining international conference on data mining. pp. 721- 724 ,(2002) , 10.1109/ICDM.2002.1184038
Avery Ching, Sergey Edunov, Maja Kabiljo, Dionysios Logothetis, Sambavi Muthukrishnan, One trillion edges Proceedings of the VLDB Endowment. ,vol. 8, pp. 1804- 1815 ,(2015) , 10.14778/2824032.2824077
Sepehr Assadi, Sanjeev Khanna, Yang Li, On estimating maximum matching size in graph streams symposium on discrete algorithms. pp. 1723- 1742 ,(2017) , 10.5555/3039686.3039799
Minlan Yu, Jianshu Chen, Hongqiang Harry Liu, Shivaram Venkataraman, Ming Zhang, Omid Alipourfard, Cherrypick: adaptively unearthing the best cloud configurations for big data analytics networked systems design and implementation. pp. 469- 482 ,(2017)
Ian Robinson, Emil Eifrem, Jim Webber, Graph Databases ,(2013)
Eslam Hussein, Abdurrahman Ghanem, Vinicius Vitor dos Santos Dias, Carlos HC Teixeira, Ghadeer AbuOda, Marco Serafini, Georgos Siganos, Gianmarco De Francisci Morales, Ashraf Aboulnaga, Mohammed Zaki, None, Graph Data Mining with Arabesque international conference on management of data. pp. 1647- 1650 ,(2017) , 10.1145/3035918.3058742
Mingxing Zhang, Yongwei Wu, Youwei Zhuo, Xuehai Qian, Chengying Huan, Kang Chen, Wonderland: A Novel Abstraction-Based Out-Of-Core Graph Processing System architectural support for programming languages and operating systems. ,vol. 53, pp. 608- 621 ,(2018) , 10.1145/3173162.3173208
Santo Fortunato, Community detection in graphs Physics Reports. ,vol. 486, pp. 75- 174 ,(2010) , 10.1016/J.PHYSREP.2009.11.002