Authors: Nicola Palmieri, Viola Nolte, Anton Suvorov, Carolin Kosiol, Christian Schlötterer
DOI: 10.1371/JOURNAL.PONE.0046415
Keywords: FlyBase: A Database of Drosophila Genes & Genomes, Biology, Genome project, Sequence analysis, Drosophila pseudoobscura, Expressed sequence tag, Annotation, Genetics, Computational biology, Genome, Intron
Abstract: RNA-Seq is a powerful tool for the annotation of genomes, in particular for the identification of isoforms and UTRs. Nevertheless, several software tools exist, and no standard strategy to obtain a reliable annotation has yet been established. We tested different combinations of the most commonly used reference-based alignment tools (TopHat, GSNAP) in combination with two frequently used assemblers (Cufflinks, Scripture) and evaluated the potential of RNA-Seq to improve the annotation of Drosophila pseudoobscura. While GSNAP maps a higher proportion of reads, TopHat resulted in a more accurate annotation when used in combination with Cufflinks. Scripture had the lowest sensitivity. Interestingly, after subsampling GSNAP to the same coverage as TopHat, we find that both mappers have similar performance, implying that the advantage of TopHat is mainly an artifact of its lower coverage. Overall, we observed low concordance among the different approaches at both the junction and isoform levels. Using data from both sexes of adult D. pseudoobscura strains, we detected alternative splicing for about 30% of the FlyBase multiple-exon genes. Moreover, we extended the boundaries of 6523 genes (about 40%). We annotated 669 new genes, 45% of them with supporting evidence. Most of the new genes are located on unassembled contigs, reflecting their incomplete annotation. Finally, we identified 99 additional genes that are not represented in the current genome contigs of D. pseudoobscura, probably due to their location in genomic regions that are difficult to assemble (e.g. heterochromatic regions).