Authors: Yoko Sogabe, Tsutomu Maruyama
Keywords:
Abstract: The rapid development of Next Generation Sequencing (NGS) has made it possible to generate more than 100G base pairs per day from one machine. The produced data are randomly fragmented DNA base pair strings, called short reads, and millions of short reads are mapped onto the reference genomes, which are the complete genetic sequences, to reconstruct the sequence of the sample DNA. This short read mapping is becoming the bottleneck of NGS systems. In this paper, we propose an FPGA system for short read mapping based on a hash-index method. In our system, the short reads are divided into seeds, the fixed-length substrings used for the mapping, and the seeds are sorted using buckets. Then, the seeds in each bucket are compared in parallel with their candidate locations. With this approach, many seeds can be compared with their candidate locations in a massively parallel manner, and it becomes possible to improve the processing speed by reducing the number of random accesses to the DRAM banks that store the candidate locations. Furthermore, substitutions of nucleotides in the seed are allowed in this comparison. This makes it possible to achieve higher matching rates than previous works.
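The seed-and-bucket flow described in the abstract can be illustrated with a small software sketch. This is not the authors' FPGA design. It is a sequential Python approximation under stated assumptions: the seed length, the use of the leading bases of a seed as the hash key, the substitution limit, and all function names (`build_index`, `bucket_seeds`, `map_seeds`) are hypothetical choices made for illustration. Because the key is matched exactly in this sketch, substitutions are tolerated only in the bases after the key.

```python
from collections import defaultdict

SEED_LEN = 8   # assumed seed length (not taken from the paper)
KEY_LEN = 4    # assumed: leading bases of a seed form the hash key
MAX_SUBS = 1   # assumed: substitutions tolerated per seed


def build_index(reference, seed_len=SEED_LEN, key_len=KEY_LEN):
    """Hash index: key -> list of candidate start positions in the reference."""
    index = defaultdict(list)
    for pos in range(len(reference) - seed_len + 1):
        index[reference[pos:pos + key_len]].append(pos)
    return index


def split_into_seeds(read, seed_len=SEED_LEN):
    """Yield (offset, seed) for non-overlapping fixed-length seeds of a read."""
    for off in range(0, len(read) - seed_len + 1, seed_len):
        yield off, read[off:off + seed_len]


def bucket_seeds(reads, seed_len=SEED_LEN, key_len=KEY_LEN):
    """Group seeds by hash key so each bucket shares one candidate list."""
    buckets = defaultdict(list)
    for read_id, read in enumerate(reads):
        for off, seed in split_into_seeds(read, seed_len):
            buckets[seed[:key_len]].append((read_id, off, seed))
    return buckets


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_seeds(reference, reads, seed_len=SEED_LEN, key_len=KEY_LEN,
              max_subs=MAX_SUBS):
    """Return (read_id, seed_offset, reference_position) for every seed hit."""
    index = build_index(reference, seed_len, key_len)
    buckets = bucket_seeds(reads, seed_len, key_len)
    hits = []
    for key in sorted(buckets):
        # The candidate list is looked up once per bucket, not once per seed.
        # In hardware this grouping is what would cut random memory accesses.
        candidates = index.get(key, [])
        for read_id, off, seed in buckets[key]:
            # Sequential here; the comparisons are independent of each other.
            for pos in candidates:
                if hamming(seed, reference[pos:pos + seed_len]) <= max_subs:
                    hits.append((read_id, off, pos))
    return hits


if __name__ == "__main__":
    reference = "ACGTTGCAAGGCTTAACCGGATCGTACC"
    # Copy of reference[4:20] with one substitution in the first seed.
    reads = ["TGCAAGTCTTAACCGG"]
    print(map_seeds(reference, reads))
```

Grouping seeds by key before comparison is the point of the sketch: every seed in a bucket reuses the same candidate list, and each seed-versus-candidate comparison is independent, which is the kind of work that could be spread across parallel comparators.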