Bayesian Genome Assembly and Assessment by Markov Chain Monte Carlo Sampling

Authors: Mark Howison, Felipe Zapata, Erika J. Edwards, Casey W. Dunn

DOI: 10.1371/JOURNAL.PONE.0099497

Abstract: Most genome assemblers construct point estimates, choosing only a single genome sequence from among many alternative hypotheses that are supported by the data. We present a Markov chain Monte Carlo approach to sequence assembly that instead generates distributions of assembly hypotheses with posterior probabilities, providing an explicit statistical framework for evaluating alternative hypotheses and assessing assembly uncertainty. We implement this approach in a prototype assembler, called Genome Assembly by Bayesian Inference (GABI), and illustrate its application to the bacteriophage ΦX174. Our sampling strategy achieves both good mixing and convergence on Illumina test data for ΦX174, demonstrating the feasibility of our approach. We summarize the posterior distribution of assembly hypotheses generated by GABI as a majority-rule consensus assembly. Then we compare the posterior distribution to external assemblies of the same test data, and annotate those assemblies by assigning posterior probabilities to features that are in common with GABI’s assembly graph. GABI is freely available under a GPL license from https://bitbucket.org/mhowison/gabi.
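
The general idea of sampling assembly hypotheses and summarizing them by majority rule can be illustrated with a toy example. The sketch below is not GABI's implementation: it assumes a tiny, hand-made set of three candidate assemblies, each represented as a set of hypothetical graph features (the edge labels such as "A-B" are invented), with made-up posterior weights. It uses a plain Metropolis sampler with a uniform proposal, then reports the features that appear in more than half of the samples.

```python
import math
import random
from collections import Counter


def metropolis_sample(hypotheses, log_post, n_steps, seed=0):
    """Metropolis sampler over a finite list of hypotheses.

    The proposal picks any hypothesis uniformly at random, so it is
    symmetric and the acceptance ratio reduces to the posterior ratio.
    """
    rng = random.Random(seed)
    current = rng.choice(hypotheses)
    samples = []
    for _ in range(n_steps):
        proposal = rng.choice(hypotheses)
        log_ratio = log_post(proposal) - log_post(current)
        # Accept uphill moves always, downhill moves with prob. exp(log_ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        samples.append(current)
    return samples


def majority_rule(samples, threshold=0.5):
    """Return features whose sampled frequency exceeds the threshold."""
    counts = Counter(feature for s in samples for feature in s)
    n = len(samples)
    return {f: c / n for f, c in counts.items() if c / n > threshold}


# Hypothetical candidate assemblies, each a set of invented graph edges.
hypotheses = [
    frozenset({"A-B", "B-C"}),
    frozenset({"A-B", "B-D"}),
    frozenset({"A-C", "C-D"}),
]
# Made-up (toy) log-posterior values for the three candidates.
toy_log_post = {
    hypotheses[0]: math.log(0.6),
    hypotheses[1]: math.log(0.3),
    hypotheses[2]: math.log(0.1),
}

samples = metropolis_sample(hypotheses, toy_log_post.get, n_steps=20000, seed=1)
consensus = majority_rule(samples)
for feature, freq in sorted(consensus.items()):
    print(feature, round(freq, 2))
```

In this toy setup the sampled frequency of each feature approximates its posterior probability, so the consensus keeps only the edges supported by most of the posterior mass. A real assembler would face a far larger hypothesis space and would need local proposals and convergence diagnostics, which this sketch leaves out.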
