Scalable mining of large disk-based graph databases

作者： Chen Wang , Wei Wang , Jian Pei , Yongtai Zhu , Baile Shi

关键词: Adjacency list 、 Molecule mining 、 Graph 、 Linear subspace 、 Computer science 、 Graph database 、 Graph (abstract data type) 、 Scalability 、 Data mining

摘要: Mining frequent structural patterns from graph databases is an interesting problem with broad applications. Most of the previous studies focus on pruning unfruitful search subspaces effectively, but few them address mining large, disk-based databases. As many in applications cannot be held into main memory, scalable remains a challenging problem. In this paper, we develop effective index structure, ADI (for adjacency index), to support various over large that memory. The simple and efficient build. Moreover, new structure can easily adopted existing pattern algorithms. example, adapt well-known gSpan algorithm by using structure. experimental results show enables one set experiments, method mine million graphs, while original only handle up 300 thousand graphs. our faster than when both run

参考文章(14)

Lawrence B. Holder, Diane J. Cook, Surnjani Djoko, Substructure discovery in the SUBDUE system knowledge discovery and data mining. pp. 169- 180 ,(1994)

TH Cormen, RL Rivest, CE Leiserson, C Stein, Introduction to Algorithms, 2nd edition. ,(2001)

Akihiro Inokuchi, Takashi Washio, Hiroshi Motoda, An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data european conference on principles of data mining and knowledge discovery. pp. 13- 23 ,(2000) , 10.1007/3-540-45372-5_2

Denis M. Bayada, Richard W. Simpson, A. Peter Johnson, Claude Laurenco, An algorithm for the multiple common subgraph problem Journal of Chemical Information and Computer Sciences. ,vol. 32, pp. 680- 685 ,(1992) , 10.1021/CI00010A015

Mohammed J. Zaki, Efficiently mining frequent trees in a forest Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '02. pp. 71- 80 ,(2002) , 10.1145/775047.775058

Yoshimasa TAKAHASHI, Yuzuru SATOH, Hajime SUZUKI, Shin-ichi SASAKI, Recognition of Largest Common Structural Fragment among a Variety of Chemical Structures Analytical Sciences. ,vol. 3, pp. 23- 28 ,(1987) , 10.2116/ANALSCI.3.23

Xifeng Yan, Jiawei Han, CloseGraph: mining closed frequent graph patterns knowledge discovery and data mining. pp. 286- 295 ,(2003) , 10.1145/956750.956784

Xifeng Yan, Philip S. Yu, Jiawei Han, Graph indexing: a frequent structure-based approach international conference on management of data. pp. 335- 346 ,(2004) , 10.1145/1007568.1007607

C. Borgelt, M.R. Berthold, Mining molecular fragments: finding relevant substructures of molecules international conference on data mining. pp. 51- 58 ,(2002) , 10.1109/ICDM.2002.1183885

10.

M. Kuramochi, G. Karypis, Frequent subgraph discovery international conference on data mining. pp. 313- 320 ,(2001) , 10.1109/ICDM.2001.989534

Scalable mining of large disk-based graph databases

来源期刊

我的账户

Scalable mining of large disk-based graph databases

来源期刊

相似文章 10

我的账户