MIRAGE: An Iterative MapReduce based FrequentSubgraph Mining Algorithm

作者: Mansurul Bhuiyan , Mohammad Al Hasan

DOI:

关键词: SoftwareTheoretical computer scienceComputationExploratory data analysisSource codeComputer scienceGraph (abstract data type)Data miningData mining algorithmGraphData structure

摘要: Frequent subgraph mining (FSM) is an important task for exploratory data analysis on graph data. Over the years, many algorithms have been proposed to solve this task. These assume that structure of small enough fit in main memory a computer. However, as real-world grows, both size and quantity, such assumption does not hold any longer. To overcome this, some database-centric methods recent years solving FSM; however, distributed solution using MapReduce paradigm has explored extensively. Since, becoming de- facto computation massive data, efficient FSM algorithm huge demand. In work, we propose frequent called MIRAGE which uses iterative based framework. complete it returns all subgraphs given user-defined support, applies optimizations latest adopt. Our experiments with real life large synthetic datasets validate effectiveness from datasets. The source code available www.cs.iupui.edu/alhasan/software/

参考文章(22)
Ramakrishnan Srikant, Rakesh Agrawal, Fast Algorithms for Mining Association Rules in Large Databases very large data bases. pp. 487- 499 ,(1994)
Sharma Chakravarthy, Subhesh Pradhan, DB-FSG: An SQL-Based Approach for Frequent Subgraph Mining database and expert systems applications. pp. 684- 692 ,(2008) , 10.1007/978-3-540-85654-2_59
Akihiro Inokuchi, Takashi Washio, Hiroshi Motoda, An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data european conference on principles of data mining and knowledge discovery. pp. 13- 23 ,(2000) , 10.1007/3-540-45372-5_2
Sharma Chakravarthy, Ramji Beera, Ramanathan Balachandran, DB-Subdue: Database Approach to Graph Mining Advances in Knowledge Discovery and Data Mining. pp. 341- 350 ,(2004) , 10.1007/978-3-540-24775-3_42
Siegfried Nijssen, Joost N. Kok, A quickstart in frequent structure mining can make a difference knowledge discovery and data mining. pp. 647- 652 ,(2004) , 10.1145/1014052.1014134
Steven Hill, Bismita Srichandan, Rajshekhar Sunderraman, An iterative MapReduce approach to frequent subgraph mining in biological datasets Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine - BCB '12. pp. 661- 666 ,(2012) , 10.1145/2382936.2383055
Stefan Kramer, Luc De Raedt, Christoph Helma, Molecular feature mining in HIV data knowledge discovery and data mining. pp. 136- 143 ,(2001) , 10.1145/502512.502533
Wang Lam, Lu Liu, Sts Prasad, Anand Rajaraman, Zoheb Vacheri, AnHai Doan, Muppet Proceedings of the VLDB Endowment. ,vol. 5, pp. 1814- 1825 ,(2012) , 10.14778/2367502.2367520
Guojun Liu, Ming Zhang, Fei Yan, Large-Scale Social Network Analysis Based on MapReduce computational aspects of social networks. pp. 487- 490 ,(2010) , 10.1109/CASON.2010.115
Diane J. Cook, Lawrence B. Holder, Gehad Galal, Ron Maglothin, Approaches to Parallel Graph-Based Knowledge Discovery Journal of Parallel and Distributed Computing. ,vol. 61, pp. 427- 446 ,(2001) , 10.1006/JPDC.2000.1696