SLAM: Cross-Species Gene Finding and Alignment with a Generalized Pair Hidden Markov Model

Authors: Marina Alexandersson, Simon Cawley, Lior Pachter

DOI: 10.1101/GR.424203

Abstract: Comparative-based gene recognition is driven by the principle that conserved regions between related organisms are more likely than divergent regions to be coding. We describe a probabilistic framework for gene structure and alignment that can be used to simultaneously find both the gene structure and alignment of two syntenic genomic regions. A key feature of the method is the ability to enhance gene predictions by finding the best alignment between two syntenic sequences, while at the same time finding biologically meaningful alignments that preserve the correspondence between coding exons. Our probabilistic framework is the generalized pair hidden Markov model, a hybrid of (1) generalized hidden Markov models, which have been used previously for gene finding, and (2) pair hidden Markov models, which have applications to sequence alignment. We have built a gene finding and alignment program called SLAM, which aligns and identifies complete exon/intron structures of genes in two related but unannotated sequences of DNA. SLAM is able to reliably predict gene structures for any suitably related pair of organisms, most notably with fewer false-positive predictions compared to previous methods (examples are provided for Homo sapiens/Mus musculus and Plasmodium falciparum/Plasmodium vivax comparisons). Accuracy is obtained by distinguishing conserved noncoding sequence (CNS) from conserved coding sequence. CNS annotation is a novel feature of SLAM and may be useful for the annotation of UTRs, regulatory elements, and other noncoding features.
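As a rough illustration of the pair hidden Markov model component mentioned in the abstract, the sketch below runs Viterbi decoding over a plain three-state pair HMM (match, gap in the second sequence, gap in the first). It is not the generalized pair HMM used by SLAM and has no exon, intron, or CNS states. The function name `pair_hmm_viterbi` and the parameters `delta` (gap open), `epsilon` (gap extend), and `p_match` (probability that an aligned pair is identical) are hypothetical, with values chosen only for demonstration.

```python
import math

NEG_INF = float("-inf")


def pair_hmm_viterbi(x, y, delta=0.2, epsilon=0.1, p_match=0.9):
    """Most probable alignment of x and y under a toy three-state pair HMM.

    States: M emits an aligned pair, X emits a base of x against a gap,
    Y emits a base of y against a gap. All parameter values are
    illustrative assumptions, not trained estimates.
    """
    n, m = len(x), len(y)
    log_q = math.log(0.25)            # single-base emission in X or Y
    t_mm = math.log(1 - 2 * delta)    # M -> M
    t_mg = math.log(delta)            # M -> X or M -> Y (gap open)
    t_gg = math.log(epsilon)          # X -> X or Y -> Y (gap extend)
    t_gm = math.log(1 - epsilon)      # X -> M or Y -> M (gap close)

    def emit_pair(a, b):
        # Joint emission over a 4-letter alphabet: 4 identical pairs,
        # 12 mismatching pairs.
        if a == b:
            return math.log(p_match / 4)
        return math.log((1 - p_match) / 12)

    states = "MXY"
    V = {s: [[NEG_INF] * (m + 1) for _ in range(n + 1)] for s in states}
    back = {s: [[None] * (m + 1) for _ in range(n + 1)] for s in states}
    V["M"][0][0] = 0.0  # the start is treated as state M before any emission

    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                score, prev = max([
                    (V["M"][i - 1][j - 1] + t_mm, "M"),
                    (V["X"][i - 1][j - 1] + t_gm, "X"),
                    (V["Y"][i - 1][j - 1] + t_gm, "Y"),
                ])
                V["M"][i][j] = score + emit_pair(x[i - 1], y[j - 1])
                back["M"][i][j] = prev
            if i > 0:
                score, prev = max([
                    (V["M"][i - 1][j] + t_mg, "M"),
                    (V["X"][i - 1][j] + t_gg, "X"),
                ])
                V["X"][i][j] = score + log_q
                back["X"][i][j] = prev
            if j > 0:
                score, prev = max([
                    (V["M"][i][j - 1] + t_mg, "M"),
                    (V["Y"][i][j - 1] + t_gg, "Y"),
                ])
                V["Y"][i][j] = score + log_q
                back["Y"][i][j] = prev

    # Trace back from the best-scoring final state.
    score, state = max((V[s][n][m], s) for s in states)
    ax, ay = [], []
    i, j = n, m
    while i > 0 or j > 0:
        prev = back[state][i][j]
        if state == "M":
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif state == "X":
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
        state = prev
    return "".join(reversed(ax)), "".join(reversed(ay)), score


aligned_x, aligned_y, log_prob = pair_hmm_viterbi("ACGT", "ACT")
print(aligned_x)
print(aligned_y)
```

A generalized pair HMM of the kind the abstract describes would additionally let each state emit variable-length segment pairs (for example, whole exon pairs), which this toy model does not attempt.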
