Covariance of maximum likelihood evolutionary distances between sequences aligned pairwise

Authors: Christophe Dessimoz, Manuel Gil

DOI: 10.1186/1471-2148-8-179

Abstract: Background: The estimation of a distance between two biological sequences is a fundamental process in molecular evolution. It is usually performed by maximum likelihood (ML) on characters aligned either pairwise or jointly in a multiple sequence alignment (MSA). Estimators for the covariance of pairs from an MSA are known, but we are not aware of any solution for cases of pairs aligned independently. In large-scale analyses, it may be too costly to compute MSAs every time distances must be compared, and therefore a covariance estimator for distances estimated from pairs aligned independently is desirable. Knowledge of covariances improves any process that compares or combines distances, such as generalized least-squares phylogenetic tree building, orthology inference, or lateral gene transfer detection. Results: In this paper, we introduce an estimator for the covariance of distances from sequences aligned pairwise. Its performance is analyzed through extensive Monte Carlo simulations and compared to the well-known variance estimator of ML distances. Our covariance estimator can be used together with the ML variance estimator to form covariance matrices. Conclusion: The estimator performs similarly to the ML variance estimator. In particular, it shows no sign of bias when sequence divergence is below 150 PAM units (i.e. above ~29% expected sequence identity). Above that distance, covariances tend to be underestimated, but then ML variances are also underestimated.
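To illustrate why covariances matter when two distances are compared, the sketch below applies the standard identity Var(a - b) = Var(a) + Var(b) - 2 Cov(a, b) to a pair of distance estimates. It is not the estimator introduced in the paper: the function names and all numeric values (distances, variances, covariance) are hypothetical and chosen only for illustration.

```python
import math


def variance_of_difference(var_a, var_b, cov_ab):
    """Variance of (a - b), from Var(a) + Var(b) - 2 * Cov(a, b)."""
    return var_a + var_b - 2.0 * cov_ab


def z_score_of_difference(d_a, d_b, var_a, var_b, cov_ab):
    """Standardized difference between two distance estimates."""
    var_diff = variance_of_difference(var_a, var_b, cov_ab)
    if var_diff <= 0.0:
        raise ValueError("variance of the difference must be positive")
    return (d_a - d_b) / math.sqrt(var_diff)


# Hypothetical inputs: two distances (in PAM units) that share a sequence,
# their variances, and an assumed positive covariance between them.
d_xy, d_xz = 60.0, 75.0
var_xy, var_xz = 40.0, 50.0
cov_xy_xz = 20.0

# Treating the two distances as independent (covariance set to zero).
z_ignoring_cov = z_score_of_difference(d_xy, d_xz, var_xy, var_xz, 0.0)

# Accounting for the covariance between the two distances.
z_with_cov = z_score_of_difference(d_xy, d_xz, var_xy, var_xz, cov_xy_xz)

print(f"z ignoring covariance: {z_ignoring_cov:.3f}")
print(f"z with covariance:     {z_with_cov:.3f}")
```

With a positive covariance, the variance of the difference shrinks, so the same observed gap between the two distances counts as stronger evidence than it would if the distances were treated as independent.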
