Authors: Asaf A. Salamov, Victor V. Solovyev
DOI:
Keywords: Gene prediction, Genome survey sequence, Coding region, Reference genome, Human genome, Comparative genomics, Genome project, Sequence analysis, Genetics, Biology
Abstract: We present a set of new programs for identifying promoters, 3'-processing sites, splice sites, coding exons and gene structure in genomic DNA of several model species. The human gene structure prediction program FGENEH, the exon prediction program FEXH and the splice site prediction program HSPL have been modified for sequence analysis of Drosophila (FGENED, FEXD and DSPL), C. elegans (FGENEN, FEXN and NSPL), yeast (FEXY and YSPL) and plant (FGENEA, FEXA and ASPL) genomic sequences. We recomputed all frequency and discriminant function parameters for these organisms and adjusted the organism-specific minimal intron lengths. The accuracy of coding region prediction for these programs is similar to that observed for FEXH and FGENEH. We have developed the FEXHB and FGENEHB programs, which combine pattern recognition features with information about the similarity of predicted exons to known sequences in protein databases. These programs have approximately 10% higher average accuracy of coding region recognition. Two new programs for human promoter site prediction (TSSG and TSSW) have been developed, which use the Ghosh (1993) and Wingender (1994) databases of functional motifs, respectively. The POLYAH program was designed for prediction of 3'-processing regions in human genes, and the CDSB program was developed for bacterial gene prediction. We have also developed a new approach to predicting multiple genes based on double dynamic programming, which is very important for the analysis of long genomic DNA fragments generated by genome sequencing projects. Analysis of uncharacterized sequences based on our methods is available through the University of Houston and Weizmann Institute of Science email servers and through Web pages at Baylor College of Medicine.