An expert system for processing sequence homology data.

Authors: Erik L. L. Sonnhammer, Richard Durbin

Abstract: When confronted with the task of finding homology to large numbers of sequences, database searching tools such as Blast and Fasta generate prohibitively large amounts of information. An automatic way of making most of the decisions a trained sequence analyst would make was developed by means of a rule-based expert system combined with an algorithm to avoid non-informative biased residue composition matches. The results found relevant by the system are presented in a very concise and clear way, so that the homology can be assessed with minimum effort. The expert system, HSPcrunch, was implemented to process the output of the programs in the BLAST suite. HSPcrunch embodies rules on detecting distant similarities when pairs of weak matches are consistent with a larger gapped alignment, i.e. when BLAST has broken a longer gapped alignment up into smaller ungapped ones. This way, more distant similarities are detected with no or little side-effects of more spurious matches. The rules for how small the gaps must be to be considered significant have been derived empirically. Currently a set of rules is used that operates on two different scoring levels, one for very weak matches that have very small gaps and one for medium weak matches that have slightly larger gaps. This set of rules proved to be robust for most cases and gives high fidelity separation between real homologies and spurious matches. One of the most important rules for reducing the amount of output is to limit the number of overlapping matches to the same region of the query sequence. (ABSTRACT TRUNCATED AT 250 WORDS)
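
The last rule mentioned in the abstract, limiting how many matches may overlap the same region of the query, can be illustrated with a small sketch. This is not HSPcrunch's actual implementation: the abstract is truncated and does not give the rule's details. The `HSP` record, the `limit_overlaps` function, the greedy best-score-first strategy, and the thresholds `max_overlapping` and `min_fraction` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HSP:
    """One ungapped match; query coordinates are 1-based and inclusive."""
    subject: str
    q_start: int
    q_end: int
    score: float


def overlap_fraction(a: HSP, b: HSP) -> float:
    """Fraction of the shorter HSP's query span that is shared with the other."""
    shared = min(a.q_end, b.q_end) - max(a.q_start, b.q_start) + 1
    if shared <= 0:
        return 0.0
    shorter = min(a.q_end - a.q_start, b.q_end - b.q_start) + 1
    return shared / shorter


def limit_overlaps(hsps, max_overlapping=2, min_fraction=0.5):
    """Greedy filter (an assumption, not the published rule).

    Visit HSPs from highest to lowest score and keep one only if fewer than
    `max_overlapping` already-kept HSPs overlap it by at least `min_fraction`.
    """
    kept = []
    for hsp in sorted(hsps, key=lambda h: h.score, reverse=True):
        crowding = sum(
            1 for k in kept if overlap_fraction(hsp, k) >= min_fraction
        )
        if crowding < max_overlapping:
            kept.append(hsp)
    return kept


if __name__ == "__main__":
    matches = [
        HSP("a", 1, 100, 200.0),
        HSP("b", 10, 90, 150.0),
        HSP("c", 20, 80, 120.0),   # third match on the same query region
        HSP("d", 300, 400, 50.0),  # separate region, unaffected by the limit
    ]
    for hsp in limit_overlaps(matches):
        print(hsp.subject, hsp.q_start, hsp.q_end, hsp.score)
```

In this sketch a weaker match is dropped only when the region it covers is already represented by enough stronger matches, so a low-scoring match to an otherwise uncovered part of the query still survives.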
