The genetic code can cause systematic bias in simple phylogenetic models

Author: Simon Whelan

DOI: 10.1098/RSTB.2008.0171

Abstract: Phylogenetic analysis depends on inferential methodology accurately estimating the degree of divergence between sequences. Inaccurate estimates can lead to misleading evolutionary inferences, including incorrect tree topology and poor dating of historical species divergence. Protein coding sequences are ubiquitous in phylogenetic inference, but many of the standard methods commonly used to describe their evolution do not explicitly account for the dependencies between sites in a codon induced by the genetic code. This study evaluates the performance of several standard methods on datasets simulated under a simple codon substitution model, describing a range of different types of selective pressures. This approach also offers insights into the relative performance of the methods when there are dependencies acting between sites in the data. Methods based on statistical models performed well when there was no or limited purifying selection (low dependency between sites in a codon), although more biologically realistic models tended to outperform simpler models. Methods exhibited greater variability in performance under strong purifying selection (high dependency between sites in a codon). Simple models substantially underestimate the degree of divergence between sequences, and the underestimation was more pronounced on the internal branches of the tree. This underestimation resulted in some methods performing poorly and exhibiting evidence of systematic bias in tree inference. Amino acid-based models and nucleotide models that contained generic descriptions of spatial and temporal heterogeneity, such as mixture and hidden Markov models, coped notably better, producing more accurate estimates of divergence and tree topology.
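The dependency between codon sites that the abstract refers to can be illustrated with a small count over the standard genetic code. The sketch below is not taken from the paper: the function name `synonymous_counts` and the choice to ignore changes to or from stop codons are assumptions made for illustration. It tallies, for each codon position, how many single-nucleotide changes between sense codons leave the encoded amino acid unchanged.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_counts():
    """Count synonymous and nonsynonymous single-nucleotide changes per codon position.

    Hypothetical helper for illustration: changes that start from or end at a
    stop codon are skipped, so only sense-to-sense changes are counted.
    """
    syn = [0, 0, 0]
    nonsyn = [0, 0, 0]
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # skip stop codons as starting points
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a change
                mutant = codon[:pos] + base + codon[pos + 1:]
                mutant_aa = CODON_TABLE[mutant]
                if mutant_aa == "*":
                    continue  # skip changes that create a stop codon
                if mutant_aa == aa:
                    syn[pos] += 1
                else:
                    nonsyn[pos] += 1
    return syn, nonsyn


if __name__ == "__main__":
    syn, nonsyn = synonymous_counts()
    for pos in range(3):
        total = syn[pos] + nonsyn[pos]
        print(f"position {pos + 1}: {syn[pos]} of {total} changes are synonymous")
```

Under purifying selection, nonsynonymous changes are disfavored, so whether a change at one position is tolerated depends on the bases at the other two positions. A model that treats the three positions as independent sites cannot represent that constraint, which is the kind of dependency the abstract describes.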
