Authors: Naiara Rodríguez-Ezpeleta, Henner Brinkmann, Béatrice Roure, Nicolas Lartillot, B. Franz Lang
DOI: 10.1080/10635150701397643
Keywords:
Abstract: Genome-scale data sets result in an enhanced resolution of the phylogenetic inference by reducing stochastic errors. However, there is also an increase of systematic errors due to model violations, which can lead to erroneous phylogenies. Here, we explore the impact of systematic errors on the resolution of the eukaryotic phylogeny using a data set of 143 nuclear-encoded proteins from 37 species. The initial observation was that, despite the impressive amount of data, some branches had no significant statistical support. To demonstrate that this lack of resolution is due to a mutual annihilation of phylogenetic and nonphylogenetic signals, we created a series of data sets with slightly different taxon sampling. As expected, these data sets yielded strongly supported but mutually exclusive trees, thus confirming the presence of conflicting phylogenetic and nonphylogenetic signals in the original data set. To decide on the correct tree, we applied several methods expected to reduce the impact of some kinds of systematic error. Briefly, we show that (i) removing fast-evolving positions, (ii) recoding amino acids into functional categories, and (iii) using a site-heterogeneous mixture model (CAT) are three effective means of increasing the ratio of phylogenetic to nonphylogenetic signal. Finally, our results allow us to formulate guidelines for detecting and overcoming phylogenetic artefacts in genome-scale phylogenetic analyses. (Compositional heterogeneity; data removal; eukaryotic phylogeny; inconsistency; long-branch attraction; nonphylogenetic signal; phylogenomics; systematic error.)