Mining tree-structured data on multicore systems

Authors: Shirish Tatikonda, Srinivasan Parthasarathy

Abstract: Mining frequent subtrees in a database of rooted and labeled trees is an important problem in many domains, ranging from phylogenetic analysis to biochemistry and from linguistic parsing to XML data analysis. In this work we revisit this problem and develop an architecture-conscious solution targeting emerging multicore systems. Specifically, we identify a sequence of memory-related optimizations that significantly improve the spatial and temporal locality of a state-of-the-art sequential algorithm, alleviating the effects of memory latency. Additionally, these optimizations are shown to reduce the pressure on the front-side bus, an important consideration in the context of large-scale multicore architectures. We then demonstrate that these optimizations, while necessary, are not sufficient for efficient parallelization on multicores, primarily due to parametric and data-driven factors which make load balancing a significant challenge. To address this challenge, we present a methodology that adaptively and automatically modulates the type and granularity of the work being shared among different cores. The resulting algorithm achieves near-perfect parallel efficiency on up to 16 processors on challenging real-world applications. The optimizations we present have general-purpose utility, and a key outcome is the development of a general-purpose scheduling service for moldable task scheduling.

