Benchmarking short sequence mapping tools

Authors: Ayat Hatem, Doruk Bozdağ, Amanda E Toland, Ümit V Çatalyürek

DOI: 10.1186/1471-2105-14-184

Abstract: The development of next-generation sequencing instruments has led to the generation of millions of short sequences in a single run. The process of aligning these reads to a reference genome is time consuming and demands fast and accurate alignment tools. However, the currently proposed tools make different compromises between the accuracy and the speed of mapping. Moreover, many important aspects are overlooked when comparing the performance of a newly developed tool to the state of the art. Therefore, there is a need for an objective evaluation method that covers all the aspects. In this work, we introduce a benchmarking suite to extensively analyze the tools with respect to various aspects and provide an objective comparison. We applied our benchmarking tests on 9 well known mapping tools, namely Bowtie, Bowtie2, BWA, SOAP2, MAQ, RMAP, GSNAP, Novoalign, and mrsFAST (mrFAST), using synthetic data and real RNA-Seq data. MAQ and RMAP are based on building hash tables for the reads, whereas the remaining tools are based on indexing the reference genome. The benchmarking tests reveal the strengths and weaknesses of each tool. The results show that no single tool outperforms all the others in all metrics. However, Bowtie maintained the best throughput for most of the tests, while BWA performed better for longer read lengths. The benchmarking tests are not restricted to the mentioned tools and can be further applied to others. The mapping process is still a hard problem that is affected by many factors. In this work, we provided a benchmarking suite that reveals and evaluates the different factors affecting the mapping process. Still, there is no tool that outperforms all the others in all the tests. Therefore, the end user should clearly specify his needs in order to choose the tool that provides the best results.
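The abstract contrasts two designs: hashing the reads versus indexing the reference genome. The sketch below illustrates that contrast for exact matches only. It is a toy under stated assumptions, not the algorithm of any benchmarked tool: the function names are hypothetical, all reads are assumed to have the same length, and a plain dictionary stands in for the more compact index structures and mismatch handling that real mappers use.

```python
from collections import defaultdict


def map_by_hashing_reads(genome, reads):
    """Hash the reads, then scan the genome once (exact matches only)."""
    read_len = len(reads[0])  # assumption: every read has the same length
    table = defaultdict(list)
    for read_id, read in enumerate(reads):
        table[read].append(read_id)

    hits = defaultdict(list)
    for pos in range(len(genome) - read_len + 1):
        window = genome[pos:pos + read_len]
        for read_id in table.get(window, ()):
            hits[read_id].append(pos)
    return dict(hits)


def map_by_indexing_genome(genome, reads):
    """Index the genome once, then look up each read (exact matches only)."""
    read_len = len(reads[0])  # assumption: every read has the same length
    index = defaultdict(list)
    for pos in range(len(genome) - read_len + 1):
        index[genome[pos:pos + read_len]].append(pos)

    return {
        read_id: list(index[read])
        for read_id, read in enumerate(reads)
        if read in index
    }


if __name__ == "__main__":
    genome = "ACGTACGTTT"
    reads = ["ACGT", "GTTT", "CCCC"]
    print(map_by_hashing_reads(genome, reads))
    print(map_by_indexing_genome(genome, reads))
```

Both functions return the same mapping from read number to matching genome positions. The practical difference is what gets built up front: the read table must be rebuilt for every new batch of reads, while a genome index can be built once and reused across runs.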
