Authors: J. Butler, I. MacCallum, M. Kleber, I. A. Shlyakhter, M. K. Belmonte
DOI: 10.1101/GR.7337908
Keywords:
Abstract: New DNA sequencing technologies deliver data at dramatically lower costs but demand new analytical methods to take full advantage of the very short reads that they produce. We provide an initial, theoretical solution to the challenge of de novo assembly from whole-genome shotgun “microreads.” For 11 genomes of sizes up to 39 Mb, we generated high-quality assemblies from 80× coverage by paired 30-base simulated reads modeled after real Illumina-Solexa reads. The bacterial genomes of Campylobacter jejuni and Escherichia coli assemble optimally, yielding single perfect contigs, and larger genomes yield assemblies that are highly connected and accurate. Assemblies are presented in a graph form that retains intrinsic ambiguities such as those arising from polymorphism, thereby providing information that has been absent from previous genome assemblies. For both C. jejuni and E. coli, this assembly graph is a single edge encompassing the entire genome. Larger genomes produce more complicated graphs, but the vast majority of the bases in their assemblies are present in long edges that are nearly always perfect. We describe a general method for genome assembly that can be applied to all types of DNA sequence data, not only short read data but also conventional sequence reads.