An assembly and alignment-free method of phylogeny reconstruction from next-generation sequencing data

Authors: Huan Fan, Anthony R. Ives, Yann Surget-Groba, Charles H. Cannon

DOI: 10.1186/S12864-015-1647-5

Abstract: Next-generation sequencing technologies are rapidly generating whole-genome datasets for an increasing number of organisms. However, phylogenetic reconstruction of genomic data remains difficult because de novo assembly for non-model genomes and multi-genome alignment are challenging. To greatly simplify the analysis, we present an Assembly and Alignment-Free (AAF) method (https://sourceforge.net/projects/aaf-phylogeny) that constructs phylogenies directly from unassembled genome sequence data, bypassing both genome assembly and alignment. Using mathematical calculations, models of sequence evolution, and simulated sequencing of published genomes, we address both evolutionary and sampling issues caused by direct reconstruction, including homoplasy, sequencing errors, and incomplete sequencing coverage. From these results, we calculate the statistical properties of the pairwise distances between genomes, allowing us to optimize parameter selection and perform bootstrapping. As a test case with real data, we successfully reconstructed the phylogeny of 12 mammals using raw sequencing reads. We also applied AAF to 21 tropical tree genome datasets with low coverage to demonstrate its effectiveness on non-model organisms. Our AAF method opens up phylogenomics for species without an appropriate reference genome or high sequence coverage, and rapidly creates a phylogenetic framework for further analysis of genome structure and diversity among non-model organisms.
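The abstract does not say how the pairwise distances are computed, so the following is only an illustrative sketch of one common alignment-free approach: compare the sets of k-mers (length-k substrings) found in each sample's raw reads. The function names, the choice of k, and the distance formula are assumptions made for illustration, not taken from the AAF software. The formula assumes that a k-mer survives only if none of its k sites has changed, so the shared fraction decays roughly as exp(-k·d). Reverse complements, sequencing errors, and coverage are ignored here.

```python
import math
from itertools import combinations


def kmers(reads, k):
    """Return the set of distinct k-mers found in a list of read strings."""
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            seen.add(read[i:i + k])
    return seen


def kmer_distance(a, b, k):
    """Distance from the shared-k-mer fraction (illustrative assumption).

    If shared / smaller is about exp(-k * d), then
    d = ln(smaller / shared) / k.
    """
    shared = len(a & b)
    smaller = min(len(a), len(b))
    if shared == 0 or smaller == 0:
        return math.inf  # nothing in common: distance is unbounded
    return math.log(smaller / shared) / k


def distance_matrix(samples, k):
    """Pairwise distances for a dict mapping sample name -> list of reads."""
    sets = {name: kmers(reads, k) for name, reads in samples.items()}
    return {
        (p, q): kmer_distance(sets[p], sets[q], k)
        for p, q in combinations(sorted(sets), 2)
    }


if __name__ == "__main__":
    # Toy reads, invented purely for demonstration.
    toy = {
        "x": ["ACGTACGT"],
        "y": ["ACGTACGT"],
        "z": ["ACGTTCGT"],
    }
    for pair, dist in distance_matrix(toy, k=3).items():
        print(pair, round(dist, 3))
```

A matrix like this could then be passed to any distance-based tree-building method, such as neighbor joining. Real read data would also need k-mer filtering to cope with sequencing errors and low coverage, which the abstract identifies as issues the method addresses.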
