Authors: Joshua P Der, Michael S Barker, Norman J Wickett, Claude W dePamphilis, Paul G Wolf
Keywords:
Abstract: Because of their phylogenetic position and the unique characteristics of their biology and life cycle, ferns represent an important lineage for studying the evolution of land plants. Large and complex genomes in ferns, combined with the absence of economically important species, have been a barrier to the development of genomic resources. However, high-throughput sequencing technologies are now being widely applied to non-model species. We leveraged the Roche 454 GS-FLX Titanium pyrosequencing platform to sequence the gametophyte transcriptome of bracken fern (Pteridium aquilinum) and develop genomic resources for evolutionary studies. A total of 681,722 quality- and adapter-trimmed reads totaling 254 Mbp were assembled de novo into 56,256 unique sequences (i.e., unigenes) with a mean length of 547.2 bp and a total assembly size of 30.8 Mbp, with an average read-depth coverage of 7.0×. We estimate that 87% of the complete transcriptome has been sequenced and that all transcripts have been tagged. Of the unigenes, 61.8% had blastx hits in the NCBI nr protein database, representing 22,596 unique best hits. The longest open reading frame in 52.2% of the unigenes had positive domain matches in InterProScan searches. We assigned a GO functional annotation to 46.2% of the unigenes and an enzyme code annotation to 16.0%. Enzyme codes were used to retrieve and color KEGG pathway maps. A comparative genomics approach revealed that a substantial proportion of the genes expressed in bracken gametophytes are shared across the genomes of Arabidopsis, Selaginella and Physcomitrella, and identified a substantial number of potentially novel fern genes. By comparing the list of Arabidopsis genes identified by blast with a list of gametophyte-specific Arabidopsis genes taken from the literature, we identified a set of potentially conserved gametophyte-specific genes. We screened the unigenes for repetitive sequences and identified 548 potentially amplifiable simple sequence repeat loci and 689 expressed transposable elements. This study is the first comprehensive transcriptome analysis for a fern and represents an important scientific resource for comparative evolutionary and functional genomics studies in land plants. We demonstrate the utility of high-throughput sequencing of a normalized cDNA library for de novo transcriptome characterization and gene discovery in a non-model plant.