Authors: Christophe Dessimoz, Manuel Gil
Keywords:
Abstract: Background: The estimation of a distance between two biological sequences is a fundamental process in molecular evolution. It is usually performed by maximum likelihood (ML) on characters aligned either pairwise or jointly in a multiple sequence alignment (MSA). Estimators for the covariance of pairs from an MSA are known, but we are not aware of any solution for cases of pairs aligned independently. In large-scale analyses, it may be too costly to compute MSAs every time distances must be compared, and therefore a covariance estimator for distances estimated from pairs aligned independently is desirable. Knowledge of covariances improves any process that compares or combines distances, such as generalized least-squares phylogenetic tree building, orthology inference, or lateral gene transfer detection. Results: In this paper, we introduce an estimator for the covariance of distances from sequences aligned pairwise. Its performance is analyzed through extensive Monte Carlo simulations and compared to the well-known variance estimator of ML distances. Our covariance estimator can be used together with the ML variance estimator to form covariance matrices. Conclusion: The estimator performs similarly to the ML variance estimator. In particular, it shows no sign of bias when sequence divergence is below 150 PAM units (i.e. above ~29% expected sequence identity). Above that distance, the covariances tend to be underestimated, but then the ML variances are also underestimated.