Genome BLAST distance phylogenies inferred from whole plastid and whole mitochondrion genome sequences

Authors: Alexander F Auch, Stefan R Henz, Barbara R Holland, Markus Göker

DOI: 10.1186/1471-2105-7-350

Keywords: Selection (genetic algorithm), Sequence alignment, Biology, Genetics, Sequence analysis, Genome, Phylogenetic tree, Distance matrices in phylogeny, Computational biology, Phylogenetics, UPGMA

Abstract: Background: Phylogenetic methods which do not rely on multiple sequence alignments are important tools in inferring trees directly from completely sequenced genomes. Here, we extend the recently described Genome BLAST Distance Phylogeny (GBDP) strategy to compute phylogenetic trees from all plastid genomes currently available and from a selection of mitochondrial genomes representing the major eukaryotic lineages. BLASTN, TBLASTX, or combinations of both are used to locate high-scoring segment pairs (HSPs) between two sequences, from which pairwise similarities and distances are computed in different ways, resulting in a total of 96 GBDP variants. The suitability of these distance formulae for phylogeny reconstruction is estimated by computing a measure of "treelikeness", the so-called δ value, from the respective distance matrices. Additionally, we compare the trees inferred from these matrices using UPGMA, NJ, BIONJ, FastME, or STC, respectively, with the NCBI taxonomy tree of the taxa under study. Results: Our results indicate that, at this taxonomic level, plastid genomes are much more valuable for inferring phylogenies than are mitochondrial genomes, and that distances based on breakpoints are of little use. Distances based on the proportion of "matched" HSP length to average genome length were best for tree estimation. Additionally, we found that using TBLASTX instead of BLASTN and, particularly, combining TBLASTX and BLASTN leads to a small but significant increase in accuracy. Other factors do not significantly affect the phylogenetic outcome. The BIONJ algorithm results in phylogenies most in accordance with the current NCBI taxonomy, with NJ and FastME performing insignificantly worse, and STC performing as well if applied to high quality distance matrices. δ values are found to be a reliable predictor of phylogenetic accuracy. Conclusion: Using the most treelike distance matrices, as judged by their δ values, distance methods are able to recover all major plant lineages, and are more in accordance with Apicomplexa organelles being derived from "green" plastids than from plastids of the "red" type. GBDP-like methods can be used to reliably infer phylogenies from different kinds of genomic data. A framework is established to further develop and improve such methods. δ values are a topology-independent tool of general use for the development and assessment of distance methods for phylogenetic inference.
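
The "treelikeness" measure mentioned in the abstract can be illustrated with a short sketch. It assumes the usual quartet-based definition of the δ value: for four taxa, the three possible sums of pairwise distances are sorted as m1 ≤ m2 ≤ m3, and δ = (m3 − m2) / (m3 − m1), with δ taken as 0 when all three sums are equal. A perfectly tree-like (additive) matrix gives δ = 0 for every quartet. The function names `quartet_delta` and `mean_delta` are hypothetical and are not taken from the paper or from any GBDP software.

```python
from itertools import combinations


def quartet_delta(d, x, y, u, v):
    """δ value of one quartet, given a symmetric distance matrix d (list of lists)."""
    # The three ways of splitting four taxa into two pairs.
    m1, m2, m3 = sorted([
        d[x][y] + d[u][v],
        d[x][u] + d[y][v],
        d[x][v] + d[y][u],
    ])
    if m3 == m1:
        # All three sums equal: treated as perfectly tree-like (assumption).
        return 0.0
    return (m3 - m2) / (m3 - m1)


def mean_delta(d):
    """Average δ over all quartets of taxa; lower means more tree-like."""
    quartets = list(combinations(range(len(d)), 4))
    if not quartets:
        raise ValueError("at least four taxa are required")
    return sum(quartet_delta(d, *q) for q in quartets) / len(quartets)


# Additive distances from the tree ((a,b),(c,d)) with all edge lengths 1.
tree_like = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]

# Same taxa, but the a-c distance is shortened so the matrix is no longer additive.
conflicting = [
    [0, 2, 2, 3],
    [2, 0, 3, 3],
    [2, 3, 0, 2],
    [3, 3, 2, 0],
]
```

With a measure of this kind, candidate distance formulae can be ranked by the mean δ of the matrices they produce, without first committing to any tree topology.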
