Authors: S. Ossowski, K. Schneeberger, R. M. Clark, C. Lanz, N. Warthmann
Keywords:
Abstract: Whole-genome hybridization studies have suggested that the nuclear genomes of accessions (natural strains) of Arabidopsis thaliana can differ by several percent of their sequence. To examine this variation, and as a first step in the 1001 Genomes Project for this species, we produced 15- to 25-fold coverage in Illumina sequencing-by-synthesis (SBS) reads for the reference accession, Col-0, and two divergent strains, Bur-0 and Tsu-1. We aligned reads to the reference genome sequence to assess data quality metrics and to detect polymorphisms. Alignments revealed 823,325 unique single nucleotide polymorphisms (SNPs) and 79,961 unique 1- to 3-bp indels in the divergent accessions at a specificity of >99%, and over 2000 potential errors in the reference genome sequence. We also identified >3.4 Mb of the Bur-0 and Tsu-1 genomes as being either extremely dissimilar, deleted, or duplicated relative to the reference genome. To obtain sequences for these regions, we incorporated the Velvet assembler into a targeted de novo assembly method. This approach yielded 10,921 high-confidence contigs that were anchored to flanking sequences and harbored indels as large as 641 bp. Our methods are broadly applicable for polymorphism discovery in moderate to large genomes even at highly diverged loci, and we established by subsampling the Illumina SBS coverage depth required to inform a broad range of functional and evolutionary studies. Our pipeline for aligning reads and predicting SNPs and indels, SHORE, is available for download at http://1001genomes.org.