Design and implementation of parallel Modified PrefixSpan method

Authors: Toshihide Sutou, Keiichi Tamura, Yasuma Mori, Hajime Kitakami

DOI: 10.1007/978-3-540-39707-6_36

Abstract: The parallelization of a Modified PrefixSpan method is proposed in this paper. The Modified PrefixSpan method is used to extract frequent patterns from a sequence database. The system developed by the authors requires the use of multiple computers connected by a local area network. The system, which has a dynamic load balancing mechanism, is achieved through communication among the computers using sockets and an MPI library. It also uses multiple threads to achieve communication between a master process and multiple slave processes. The master process controls both the slave processes and the global job pool, which manages the set of subtrees generated in the initial processing. The results obtained here indicated that 8 computers were approximately 6 times faster than 1 computer in trial implementation experiments.
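The job-pool idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain PrefixSpan over single-character items rather than the Modified PrefixSpan method, Python threads stand in for slave processes on separate computers, and a `queue.Queue` stands in for the global job pool that the paper manages over sockets and MPI. All names (`project`, `frequent_items`, `mine_subtree`, `parallel_mine`) and the `max_len` cutoff are hypothetical and chosen only for illustration.

```python
import queue
import threading


def project(db, item):
    """Projected database: the suffix after the first occurrence of item."""
    out = []
    for seq in db:
        i = seq.find(item)
        if i != -1:
            out.append(seq[i + 1:])
    return out


def frequent_items(db, min_support):
    """Items that occur in at least min_support sequences of db."""
    counts = {}
    for seq in db:
        for ch in set(seq):
            counts[ch] = counts.get(ch, 0) + 1
    return sorted(ch for ch, c in counts.items() if c >= min_support)


def mine_subtree(prefix, projected, min_support, max_len):
    """Depth-first mining of every frequent pattern that starts with prefix."""
    # Each sequence containing the prefix contributes exactly one suffix,
    # so the size of the projected database is the support of the prefix.
    found = {prefix: len(projected)}
    if len(prefix) < max_len:
        for ch in frequent_items(projected, min_support):
            found.update(
                mine_subtree(prefix + ch, project(projected, ch),
                             min_support, max_len)
            )
    return found


def parallel_mine(db, min_support, max_len, n_workers=4):
    """Initial processing fills the job pool; workers pull subtrees on demand."""
    jobs = queue.Queue()  # stand-in for the global job pool (assumption)
    for ch in frequent_items(db, min_support):
        jobs.put((ch, project(db, ch)))  # one job per first-level subtree

    results = {}
    lock = threading.Lock()

    def worker():
        # A worker that finishes early simply takes the next job, which is
        # what balances uneven subtree sizes dynamically.
        while True:
            try:
                prefix, projected = jobs.get_nowait()
            except queue.Empty:
                return
            found = mine_subtree(prefix, projected, min_support, max_len)
            with lock:
                results.update(found)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `parallel_mine(["MKTAY", "MKAY", "KTAY", "MTY"], min_support=3, max_len=2)` returns a dictionary that maps each frequent pattern to its support. Because workers take jobs only when idle, no static assignment of subtrees to workers is needed. For scale, the reported result of roughly 6 times faster on 8 computers corresponds to a parallel efficiency of about 6/8 = 0.75.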
