Authors: Xiaoqing Yu, Kishore Guda, Joseph Willis, Martina Veigl, Zhenghe Wang
Keywords: Simulated data, DNA sequencing, Hybrid genome assembly, Computer science, Sequencing data, Data mining, Repetitive regions, Quality (business), Genome
Abstract: Background: Next-generation sequencing technologies generate a significant number of short reads that are utilized to address a variety of biological questions. However, quite often, these reads tend to have low quality at the 3’ end and are generated from the repetitive regions of a genome. It is unclear how different alignment programs perform under these cases. In order to investigate this question, we use both real data and simulated data with the above issues to evaluate the performance of four commonly used algorithms: SOAP2, Bowtie, BWA, and Novoalign.