Authors: José M Abuín, Juan C Pichel, Tomás F Pena, Jorge Amigo
DOI: 10.1371/JOURNAL.PONE.0155461
Keywords:
Abstract: Next-generation sequencing (NGS) technologies have led to a huge amount of genomic data that need to be analyzed and interpreted. This fact has a huge impact on the DNA sequence alignment process, which nowadays requires the mapping of billions of small DNA sequences onto a reference genome. In this way, sequence alignment remains the most time-consuming stage in the sequence analysis workflow. To deal with this issue, state-of-the-art aligners take advantage of parallelization strategies. However, the existent solutions show limited scalability and have a complex implementation. In this work we introduce SparkBWA, a new tool that exploits the capabilities of a big data technology such as Spark to boost the performance of one of the most widely adopted aligners, the Burrows-Wheeler Aligner (BWA). The design of SparkBWA uses two independent software layers in such a way that no modifications to the original BWA source code are required, which assures its compatibility with any BWA version (future or legacy). SparkBWA is evaluated in different scenarios, showing noticeable results in terms of performance and scalability. A comparison to other parallel BWA-based aligners validates the benefits of our approach. Finally, an intuitive and flexible API is provided to NGS professionals in order to facilitate the acceptance and adoption of the new tool. The source code of the software described in this paper is publicly available at https://github.com/citiususc/SparkBWA, with a GPL3 license.