RazerS 3

Authors: David Weese, Manuel Holtgrewe, Knut Reinert

DOI: 10.1093/BIOINFORMATICS/BTS505

Keywords: Key (cryptography), Throughput (business), Sensitivity (control systems), Filter (video), Index (publishing), Computer engineering, Source code, Theoretical computer science, Computer science

Abstract: Motivation: During the past years, next-generation sequencing has become a key technology for many applications in the biomedical sciences. Throughput continues to increase and new protocols provide longer reads than currently available. In almost all applications, read mapping is a first step. Hence, it is crucial to have algorithms and implementations that perform fast, with high sensitivity, and are able to deal with long reads and a large absolute number of insertions and deletions. Results: RazerS is a read mapping program with adjustable sensitivity based on counting q-grams. In this work, we propose the successor RazerS 3, which now supports shared-memory parallelism, an additional seed-based filter with adjustable sensitivity, a much faster, banded version of Myers' bit-vector algorithm for verification, memory-saving measures and support for the SAM output format. This leads to a much improved performance for mapping reads, in particular, long reads with many errors. We extensively compare RazerS 3 with other popular read mappers and show that its results are often superior to them in terms of sensitivity while exhibiting practical and often competitive run times. In addition, RazerS 3 works without a pre-computed index. Availability and Implementation: Source code and binaries are freely available for download at http://www.seqan.de/projects/razers. RazerS 3 is implemented in C++ and OpenMP under a GPL license using the SeqAn library and supports Linux, Mac OS X and Windows. Contact: david.weese@fu-berlin.de Supplementary information: Supplementary data are available at Bioinformatics online.
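
The filter-then-verify idea named in the abstract (counting q-grams to find candidate regions, then verifying them with Myers' bit-vector algorithm) can be illustrated with a minimal Python sketch. This is not the RazerS 3 implementation, which is written in C++ with SeqAn and uses a banded verification step. The sketch assumes the q-gram lemma threshold t = m + 1 - (k + 1)q for a read of length m with at most k errors, uses a simple per-diagonal counter in place of the tool's actual filter, and runs the unbanded bit-vector recurrence. All function names (`qgram_threshold`, `candidate_diagonals`, `myers_search`, `map_read`) are hypothetical.

```python
from collections import defaultdict


def qgram_threshold(m, q, k):
    """q-gram lemma: a read of length m with at most k errors shares
    at least m + 1 - (k + 1) * q q-grams with its true match."""
    return m + 1 - (k + 1) * q


def candidate_diagonals(read, ref, q, k):
    """Count q-gram hits per diagonal (ref_pos - read_pos), smeared by
    +/- k so that indels cannot push a true match below the threshold."""
    t = qgram_threshold(len(read), q, k)
    if t <= 0:
        raise ValueError("q is too large for this read length and error count")
    index = defaultdict(list)  # q-gram -> start positions in the read
    for i in range(len(read) - q + 1):
        index[read[i:i + q]].append(i)
    counts = defaultdict(int)
    for j in range(len(ref) - q + 1):
        for i in index.get(ref[j:j + q], ()):
            d = j - i
            for b in range(d - k, d + k + 1):
                counts[b] += 1
    return sorted(d for d, c in counts.items() if c >= t)


def myers_search(pattern, text, k):
    """Unbanded Myers bit-vector approximate search.
    Returns (end_index_in_text, errors) for every column with errors <= k."""
    m = len(pattern)
    mask = (1 << m) - 1
    high = 1 << (m - 1)
    peq = defaultdict(int)  # character -> bitmask of its pattern positions
    for i, c in enumerate(pattern):
        peq[c] |= 1 << i
    pv, mv, score = mask, 0, m  # vertical +1 / -1 deltas, last-row score
    hits = []
    for j, c in enumerate(text):
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask  # horizontal +1 deltas
        mh = pv & xh                   # horizontal -1 deltas
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1
        ph <<= 1  # search variant: row 0 stays 0, so no carry-in bit
        mh <<= 1
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv & mask
        if score <= k:
            hits.append((j, score))
    return hits


def map_read(read, ref, q, k):
    """Filter with q-gram counting, then verify each candidate window.
    Returns {end position in ref: smallest error count found there}."""
    m = len(read)
    best = {}
    for d in candidate_diagonals(read, ref, q, k):
        start, end = max(0, d - k), min(len(ref), d + m + k)
        if start >= end:
            continue
        for j, errors in myers_search(read, ref[start:end], k):
            pos = start + j
            if pos not in best or errors < best[pos]:
                best[pos] = errors
    return best
```

As a usage example under these assumptions, calling `map_read(read, ref, q=4, k=1)` on a 16-base read copied from a reference with one substituted base reports the end position of the original substring with one error, while a read that shares no q-gram with the reference yields an empty result without any verification work.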
