Authors: Fedja Hadzic, Michael Hecker, Andrea Tagarelli
DOI: 10.1016/J.INS.2015.03.015
Keywords:
Abstract: Frequent subtree mining is a major research topic in knowledge discovery from tree-structured data, whose importance is witnessed by the pervasiveness of such data in several domains. In this paper, we present a novel approach to discover all frequent ordered subtrees in a tree-structured database. A key aspect is that the structural aspects of the input tree instances are extracted to generate a transactional format that enables the application of standard itemset mining techniques. In this way, the expensive process of subtree enumeration is avoided, while subtrees can be reconstructed in a post-processing stage. As a result, more structurally complex tree data can be handled and much lower support thresholds can be used. In addition to discovering traditional subtrees, this is the first approach to discover position-constrained subtrees. Each node in a position-constrained subtree is annotated with its exact occurrence and level of embedding in the original database tree. Also, disconnected subtree associations can be represented via virtual connecting nodes. Experiments conducted on synthetic and real-world datasets confirm the expected advantages of our approach over competing methods in terms of efficiency, mining capabilities, and informativeness of the extracted patterns.
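
The general idea of mapping trees to transactions and then mining itemsets can be illustrated with a minimal Python sketch. It is not the authors' algorithm or database schema: the nested-tuple tree encoding, the use of a child-index path as the node position, the function names `tree_to_transaction` and `frequent_itemsets`, and the naive level-wise miner are all assumptions made here for illustration only.

```python
from collections import Counter

def tree_to_transaction(tree, path=()):
    """Flatten a tree (label, [children]) into a set of (path, label) items.

    `path` is the tuple of child indices from the root, so len(path) is the
    node's level. This positional encoding is an illustrative assumption.
    """
    label, children = tree
    items = {(path, label)}
    for i, child in enumerate(children):
        items |= tree_to_transaction(child, path + (i,))
    return items

def frequent_itemsets(transactions, min_support):
    """Naive level-wise (Apriori-style) mining; returns {itemset: support}."""
    counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]) for i, c in counts.items() if c >= min_support}
    result = {}
    k = 1
    while current:
        for s in current:
            result[s] = sum(1 for t in transactions if s <= t)
        k += 1
        # Join frequent (k-1)-itemsets into k-itemset candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        current = {
            c for c in candidates
            if sum(1 for t in transactions if c <= t) >= min_support
        }
    return result

# Toy database of three ordered labeled trees (hypothetical data).
db = [
    ("a", [("b", []), ("c", [])]),
    ("a", [("b", []), ("d", [])]),
    ("a", [("b", [("e", [])]), ("c", [])]),
]
transactions = [tree_to_transaction(t) for t in db]
patterns = frequent_itemsets(transactions, min_support=2)
```

Because every item carries its position, a frequent itemset in this sketch can be turned back into a position-constrained tree pattern after mining, which loosely mirrors the post-processing reconstruction the abstract describes.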