DNA+Pro: an Improved Progressive Multiple Sequence Alignment Algorithm for Evolutionary Analysis Using Combined DNA-Protein Sequences

Authors: Xiaolong Wang, Shuang-yong Xu, Deming Gou

DOI: 10.1038/NPRE.2010.4898.1

Keywords: Sequence alignment, DNA sequencing, DNA, Phylogenetic tree, Multiple sequence alignment, Algorithm, Sequence (medicine), Computer science, Restriction enzyme, Phylogenetics

Abstract: Alignment of DNA and protein sequences is a basic tool in the study of evolutionary, structural and functional relationships among macromolecules. Present sequence alignment methods are somewhat error-prone, often producing systematic bias. Errors in alignments sometimes lead to subsequent misinterpretation of the evolutionary information in genes, proteins and genomes. In traditional algorithms, the alignment of DNA and protein sequences is conducted separately. It has long been believed that phylogenetic signal disappears more rapidly from DNA sequences than from the encoded proteins, and it is therefore generally preferable to align sequences at the amino acid level. Here we present a new method, DNA+Pro, which aggregates DNA and protein sequences into combined DNA-protein sequences and aligns them in a combined fashion. We demonstrate that combining DNA and protein sequences can improve the quality of multiple sequence alignment and solve practical evolutionary problems in primate immunodeficiency virus and bacterial restriction enzymes. In addition to increased theoretical information content, alignments and distance estimations are biologically more significant in combined alignments than in DNA-only or protein-only alignments. By integrating the evolutionary information buried separately in DNA and protein sequences, DNA+Pro improves the accuracy of alignment of closely-related sequences and prevents certain errors that may occur in phylogeny analysis using traditional approaches. The software and supplementary data are downloadable free of charge from our website, http://www.dnapluspro.com.
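The aggregation step mentioned in the abstract can be pictured with a small sketch. The abstract does not specify how the combined DNA-protein sequence is encoded, so everything here is an assumption made for illustration: the function name `combine_dna_protein`, the choice to append each codon's amino acid directly after the codon, the use of the standard genetic code, and the handling of gaps and ambiguous codons. It is not the DNA+Pro implementation.

```python
# Hypothetical illustration only: one possible way to build a combined
# DNA-protein sequence from an in-frame coding sequence.

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in _BASES for b in _BASES for c in _BASES), _AMINO)
)


def combine_dna_protein(dna: str) -> str:
    """Return codon+amino-acid units joined into one string.

    Assumed conventions (not taken from the paper):
    - the input is in frame and its length is a multiple of 3;
    - a full gap codon '---' maps to the gap character '-';
    - any codon outside the table (e.g. containing N) maps to 'X'.
    """
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")

    units = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino = "-" if codon == "---" else CODON_TABLE.get(codon, "X")
        units.append(codon + amino)  # four symbols per unit: 3 nt + 1 aa
    return "".join(units)


if __name__ == "__main__":
    print(combine_dna_protein("ATGAAATGA"))
```

With this encoding each aligned column group carries both the nucleotide and the amino acid evidence for a codon, which is the general idea the abstract describes; how the two are scored during alignment is left out of this sketch.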
