Authors: Nicole L Quinn, Natasha Levenkova, William Chow, Pascal Bouffard, Keith A Boroevich
Keywords:
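Abstract: With a whole genome duplication event and a wealth of biological data, salmonids are excellent model organisms for studying evolutionary processes, the fates of duplicated genes, and the genetic and physiological processes associated with complex behavioral phenotypes. It is surprising, therefore, that no salmonid genome has been sequenced. Atlantic salmon (Salmo salar) is a good representative salmonid for sequencing given its importance in aquaculture and the genomic resources available. However, the size and complexity of the genome, combined with the lack of a sequenced reference genome from a closely related fish, make assembly challenging. Given the cost and time limitations of Sanger sequencing, as well as recent improvements to next generation sequencing technologies, we examined the feasibility of using the Genome Sequencer (GS) FLX pyrosequencing system to obtain the sequence of a salmonid genome. Eight pooled BACs belonging to a minimum tiling path covering ~1 Mb of the Atlantic salmon genome were sequenced by GS FLX shotgun and Long Paired End sequencing and compared with a ninth BAC sequenced by Sanger sequencing of a shotgun library. An initial assembly using only GS FLX shotgun sequences (average read length 248.5 bp) with ~30× coverage allowed gene identification, but was incomplete even when 126 Sanger-generated BAC-end sequences (~0.09× coverage) were incorporated. The addition of paired end sequencing reads (additional ~26× coverage) produced a final assembly comprising 175 contigs assembled into four scaffolds with 171 gaps. Sanger sequencing of the ninth BAC (~10.5× coverage) produced nine contigs and two scaffolds. The number of scaffolds produced by the GS FLX assembly was comparable to that from Sanger sequencing; however, the number of gaps was much higher in the GS FLX assembly. These results represent the first use of GS FLX paired end reads for de novo sequence assembly. Our data demonstrated that this improved the GS FLX assemblies; however, with respect to de novo sequencing of complex genomes, the GS FLX technology is limited to gene mining and establishing a set of ordered sequence contigs. Currently, for a salmonid reference sequence, it appears that a substantial portion of the sequencing should be done using Sanger technology.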
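The coverage figures quoted in the abstract follow the usual relation coverage = (number of reads × mean read length) / target length. The sketch below is a back-of-envelope illustration only: the function names are hypothetical, the ~1 Mb target is the approximate figure from the abstract, and the read counts it produces are estimates, not values reported by the authors.

```python
import math


def coverage(num_reads: int, mean_read_len_bp: float, target_len_bp: float) -> float:
    """Average depth of coverage: total sequenced bases divided by target length."""
    return num_reads * mean_read_len_bp / target_len_bp


def reads_for_coverage(depth: float, mean_read_len_bp: float, target_len_bp: float) -> int:
    """Smallest whole number of reads that reaches the requested average depth."""
    return math.ceil(depth * target_len_bp / mean_read_len_bp)


# Assumption: the pooled BACs span about 1 Mb, as stated approximately in the abstract.
TARGET_BP = 1_000_000

# Estimated (not reported) number of 248.5 bp shotgun reads needed for ~30x depth.
shotgun_reads = reads_for_coverage(30, 248.5, TARGET_BP)

# 126 BAC-end reads at ~0.09x depth imply a mean read length of roughly 714 bp.
implied_bac_end_len = 0.09 * TARGET_BP / 126

if __name__ == "__main__":
    print(f"Estimated shotgun reads for 30x: {shotgun_reads}")
    print(f"Implied mean BAC-end read length: {implied_bac_end_len:.0f} bp")
```

Under these assumptions, ~30× depth over ~1 Mb corresponds to on the order of 10^5 shotgun reads, while the 126 BAC-end reads add almost no depth; their value to the assembly would lie in long-range linking information rather than coverage.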