Ordered subtree mining via transactional mapping using a structure-preserving tree database schema

Authors: Fedja Hadzic, Michael Hecker, Andrea Tagarelli

DOI: 10.1016/J.INS.2015.03.015

Keywords:

Abstract: Frequent subtree mining is a major research topic in knowledge discovery from tree-structured data, whose importance is witnessed by the pervasiveness of such data in several domains. In this paper, we present a novel approach to discover all frequent ordered subtrees in a tree-structured database. A key aspect is that the structural aspects of the input tree instances are extracted to generate a transactional format that enables the application of standard itemset mining techniques. In this way, the expensive process of subtree enumeration is avoided, while subtrees can be reconstructed in a post-processing stage. As a result, more structurally complex tree data can be handled and much lower support thresholds can be used. In addition to discovering traditional subtrees, this is the first approach to discover position-constrained subtrees. Each node in a position-constrained subtree is annotated with its exact occurrence and level of embedding in the original database tree. Also, disconnected subtree associations can be represented via virtual connecting nodes. Experiments conducted on synthetic and real-world datasets confirm the expected advantages of our approach over competing methods in terms of efficiency, mining capabilities, and informativeness of the extracted patterns.
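The idea of turning tree structure into a transactional format can be illustrated with a minimal Python sketch. It is not the paper's schema or algorithm: the position encoding (the tuple of child indexes from the root), the helper names `tree_to_transaction` and `frequent_itemsets`, and the toy database `db` are all assumptions made for illustration, and the brute-force counter merely stands in for any standard itemset miner.

```python
from collections import Counter
from itertools import combinations

# A tree is (label, [children]); child order matters (ordered trees).
def tree_to_transaction(tree, path=()):
    """Flatten an ordered tree into a set of (position, label) items.

    The position is the tuple of child indexes from the root. This is a
    simplifying assumption for the sketch, not the paper's schema.
    """
    label, children = tree
    items = {(path, label)}
    for i, child in enumerate(children):
        items |= tree_to_transaction(child, path + (i,))
    return items

def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force itemset counting; a stand-in for a real itemset miner."""
    counts = Counter()
    for t in transactions:
        for k in range(1, max_size + 1):
            # Sorting gives each itemset one canonical tuple key.
            for combo in combinations(sorted(t), k):
                counts[combo] += 1
    return {c: n for c, n in counts.items() if n >= min_support}

# Hypothetical toy database of three ordered trees.
db = [
    ("a", [("b", []), ("c", [])]),
    ("a", [("b", []), ("d", [])]),
    ("a", [("b", [("e", [])]), ("c", [])]),
]
transactions = [tree_to_transaction(t) for t in db]
frequent = frequent_itemsets(transactions, min_support=3)
```

Because every item carries its position, a frequent itemset such as `{((), "a"), ((0,), "b")}` can be read back as a subtree (root `a` with first child `b`) without enumerating candidate subtrees during mining, which is the kind of post-processing reconstruction the abstract describes.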

References (42)
Siegfried Nijssen, Joost Kok. Efficient discovery of frequent unordered trees. First International Workshop on Mining Graphs, Trees and Sequences (2003).
Fedja Hadzic, Henry Tan, Tharam S. Dillon. Mining of Data with Complex Structures. Springer, vol. 333, pp. 1-326 (2011). DOI: 10.1007/978-3-642-17557-2
Fedja Hadzic. A Structure Preserving Flat Data Format Representation for Tree-Structured Data. New Frontiers in Applied Data Mining, pp. 221-233 (2012). DOI: 10.1007/978-3-642-28320-8_19
Mohammed J. Zaki. Efficiently Mining Frequent Embedded Unordered Trees. Fundamenta Informaticae, vol. 66, pp. 33-52 (2004). DOI: 10.5555/1227174.1227177
Fedja Hadzic, Henry Tan, Tharam S. Dillon. Model guided algorithm for mining unordered embedded subtrees. Web Intelligence and Agent Systems: An International Journal, vol. 8, pp. 413-430 (2010). DOI: 10.3233/WIA-2010-0200
Kenji Abe, Shinji Kawasoe, Tatsuya Asai, Hiroki Arimura, Setsuo Arikawa. Optimized Substructure Discovery for Semi-structured Data. European Conference on Principles of Data Mining and Knowledge Discovery, vol. 206, pp. 1-14 (2002). DOI: 10.1007/3-540-45681-3_1
Chen Wang, Mingsheng Hong, Jian Pei, Haofeng Zhou, Wei Wang, Baile Shi. Efficient Pattern-Growth Methods for Frequent Tree Pattern Mining. Advances in Knowledge Discovery and Data Mining, pp. 441-451 (2004). DOI: 10.1007/978-3-540-24775-3_54
Tatsuya Asai, Hiroki Arimura, Takeaki Uno, Shin-ichi Nakano. Discovering frequent substructures in large unordered trees. Discovery Science, vol. 2843, pp. 47-61 (2003). DOI: 10.1007/978-3-540-39644-4_6
A. Termier, M.-C. Rousset, M. Sebag. Dryade: a new approach for discovering closed frequent trees in heterogeneous tree databases. International Conference on Data Mining, pp. 543-546 (2004). DOI: 10.1109/ICDM.2004.10078