BFAST: An Alignment Tool for Large Scale Genome Resequencing

Authors: Nils Homer, Barry Merriman, Stanley F. Nelson

DOI: 10.1371/JOURNAL.PONE.0007767

Keywords:

Abstract: Background: The new generation of massively parallel DNA sequencers, combined with the challenge of whole human genome resequencing, results in the need for rapid and accurate alignment of billions of short DNA sequence reads to a large reference genome. Speed is obviously of great importance, but equally important is maintaining alignment accuracy of short reads, in the 25–100 base range, in the presence of errors and true biological variation. Methodology: We introduce a new algorithm specifically optimized for this task, as well as a freely available implementation, BFAST, which can align data produced by any of the current sequencing platforms, allows for user-customizable levels of speed and accuracy, supports paired end data, and provides for efficient parallel and multi-threaded computation on a computer cluster. The new method is based on creating flexible, efficient whole genome indexes to rapidly map reads to candidate alignment locations, with arbitrary multiple independent indexes allowed to achieve robustness against read errors and sequence variants. The final local alignment uses a Smith-Waterman method, with gaps to support the detection of small indels. Conclusions: We compare BFAST to a selection of large-scale alignment tools - BLAT, MAQ, SHRiMP, and SOAP - in terms of both speed and accuracy, using simulated and real-world datasets. We show BFAST can achieve substantially greater sensitivity of alignment in the context of errors and true variants, especially insertions and deletions, and minimize false mappings, while maintaining adequate speed compared to other current methods. We show BFAST can align the amount of data needed to fully resequence a human genome, one billion reads, with high sensitivity and accuracy, on a modest computer cluster in less than 24 hours. BFAST is available at http://bfast.sourceforge.net.
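
The two-stage idea described in the abstract (look up candidate locations through several independent indexes, then score each candidate with a gapped Smith-Waterman local alignment) can be illustrated with a toy sketch. This is not BFAST's implementation or file format: the function names, the seed masks, the scoring values, the linear gap penalty, and the `pad` parameter are all assumptions chosen for illustration, and the in-memory dictionary stands in for a real whole-genome index.

```python
from collections import defaultdict


def mask_key(seq, start, mask):
    """Return the bases of seq under the '1' positions of mask, starting at start."""
    if start + len(mask) > len(seq):
        return None
    return "".join(seq[start + i] for i, m in enumerate(mask) if m == "1")


def build_index(reference, mask):
    """Map each masked key to the list of reference offsets where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - len(mask) + 1):
        index[mask_key(reference, pos, mask)].append(pos)
    return index


def candidate_starts(read, indexes):
    """Union of implied read start positions over all (mask, index) pairs."""
    hits = set()
    for mask, index in indexes:
        for offset in range(len(read) - len(mask) + 1):
            key = mask_key(read, offset, mask)
            for pos in index.get(key, ()):
                hits.add(pos - offset)
    return hits


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a against b (linear gap penalty, assumed values)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


def align(read, reference, indexes, pad=2):
    """Score every candidate window; return (score, window_start) of the best, or None."""
    best = None
    for start in candidate_starts(read, indexes):
        lo = max(0, start - pad)
        hi = min(len(reference), start + len(read) + pad)
        score = smith_waterman_score(read, reference[lo:hi])
        if best is None or score > best[0]:
            best = (score, lo)
    return best


if __name__ == "__main__":
    reference = "TTGACCGTAAGCTAGGCATCGATTACGGCA"
    read = "AAGCTTGGCATC"  # reference[8:20] with one substituted base
    indexes = [(m, build_index(reference, m)) for m in ("1101", "111")]
    print(align(read, reference, indexes))
```

In this sketch, using more than one mask means a read that fails to hit one index because of a mismatch under a '1' position can still be found through another, which is the robustness argument the abstract makes for multiple independent indexes. The padding around each candidate window leaves room for the gapped alignment to absorb a small insertion or deletion.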

References (19)
Nils Homer, Barry Merriman, Stanley F Nelson, Local alignment of two-base encoded DNA sequence. BMC Bioinformatics, vol. 10, p. 175 (2009), 10.1186/1471-2105-10-175
Yanni Sun, Jeremy Buhler, Designing multiple simultaneous seeds for DNA similarity search. Journal of Computational Biology, vol. 12, pp. 847-861 (2005), 10.1089/CMB.2005.12.847
T.F. Smith, M.S. Waterman, Identification of common molecular subsequences. Journal of Molecular Biology, vol. 147, pp. 195-197 (1981), 10.1016/0022-2836(81)90087-5
H. Li, R. Durbin, Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics, vol. 25, pp. 1754-1760 (2009), 10.1093/BIOINFORMATICS/BTP324
L. Ilie, S. Ilie, Multiple spaced seeds for homology search. Bioinformatics, vol. 23, pp. 2969-2977 (2007), 10.1093/BIOINFORMATICS/BTM422
David R Bentley, Whole-genome re-sequencing. Current Opinion in Genetics & Development, vol. 16, pp. 545-552 (2006), 10.1016/J.GDE.2006.10.009
H. Li, J. Ruan, R. Durbin, Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Research, vol. 18, pp. 1851-1858 (2008), 10.1101/GR.078212.108
D. R. Smith, A. R. Quinlan, H. E. Peckham, K. Makowsky, W. Tao, B. Woolf, L. Shen, W. F. Donahue, N. Tusneem, M. P. Stromberg, D. A. Stewart, L. Zhang, S. S. Ranade, J. B. Warner, C. C. Lee, B. E. Coleman, Z. Zhang, S. F. McLaughlin, J. A. Malek, J. M. Sorenson, A. P. Blanchard, J. Chapman, D. Hillman, F. Chen, D. S. Rokhsar, K. J. McKernan, T. W. Jeffries, G. T. Marth, P. M. Richardson, Rapid whole-genome mutational profiling using next-generation sequencing technologies. Genome Research, vol. 18, pp. 1638-1642 (2008), 10.1101/GR.077776.108
Ben Langmead, Cole Trapnell, Mihai Pop, Steven L Salzberg, Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology, vol. 10, pp. 1-10 (2009), 10.1186/GB-2009-10-3-R25
Marcel Margulies, Michael Egholm, William E Altman, Said Attiya, Joel S Bader, Lisa A Bemben, Jan Berka, Michael S Braverman, Yi-Ju Chen, Zhoutao Chen, Scott B Dewell, Lei Du, Joseph M Fierro, Xavier V Gomes, Brian C Godwin, Wen He, Scott Helgesen, Chun He Ho, Gerard P Irzyk, Szilveszter C Jando, Maria LI Alenquer, Thomas P Jarvie, Kshama B Jirage, Jong-Bum Kim, James R Knight, Janna R Lanza, John H Leamon, Steven M Lefkowitz, Ming Lei, Jing Li, Kenton L Lohman, Hong Lu, Vinod B Makhijani, Keith E McDade, Michael P McKenna, Eugene W Myers, Elizabeth Nickerson, John R Nobile, Ramona Plant, Bernard P Puc, Michael T Ronan, George T Roth, Gary J Sarkis, Jan Fredrik Simons, John W Simpson, Maithreyan Srinivasan, Karrie R Tartaro, Alexander Tomasz, Kari A Vogt, Greg A Volkmer, Shally H Wang, Yong Wang, Michael P Weiner, Pengguang Yu, Richard F Begley, Jonathan M Rothberg, Genome sequencing in microfabricated high-density picolitre reactors. Nature, vol. 437, pp. 376-380 (2005), 10.1038/NATURE03959