Author: Susanne Mignon Balzer
DOI:
Keywords:
Abstract: Motivation: The commercial launch of 454 pyrosequencing in 2005 was a milestone in genome sequencing in terms of performance and cost. Throughout the three available releases, average read lengths have increased to ∼500 base pairs and are thus approaching the read lengths obtained from traditional Sanger sequencing. Study design of sequencing projects would benefit from being able to simulate experiments. Results: We explore 454 raw data to investigate its characteristics and derive empirical distributions for the flow values generated by pyrosequencing. Based on our findings, we implement Flowsim, a simulator that generates realistic pyrosequencing data files of arbitrary size from a given set of input DNA sequences. We finally use our simulator to examine the impact of sequence lengths on the results of concrete whole-genome assemblies, and we suggest its use in planning of sequencing projects, benchmarking of assembly methods and other fields. Availability: Flowsim is freely available under the General Public License from http://blog.malde.org/index.php/flowsim/ Contact: susanne.balzer@imr.no; ketil.malde@imr.no