An optimized approach for annotation of large eukaryotic genomic sequences using genetic algorithm.

Authors: Biswanath Chowdhury, Arnav Garai, Gautam Garai

DOI: 10.1186/S12859-017-1874-7

Abstract: Detection of important functional and/or structural elements and identification of their positions in a large eukaryotic genomic sequence remain an active research area. The gene is an important functional and structural unit of DNA. The computation of gene prediction is, therefore, essential for detailed genome annotation. In this paper, we propose a new gene prediction technique based on Genetic Algorithm (GA) to determine the optimal positions of exons of a gene in a chromosome or genome. The correct identification of the coding and non-coding regions is difficult and computationally demanding. The proposed genetic-based method, named Gene Prediction with Genetic Algorithm (GPGA), reduces this problem by searching only one exon at a time instead of all exons along with the introns. This representation carries a significant advantage in that it breaks the entire gene-finding problem into a number of smaller sub-problems, thereby reducing the computational complexity. We tested the performance of GPGA with existing benchmark datasets and compared the results with well-known and relevant techniques. The comparison shows better or comparable performance of the proposed method. We also used GPGA for annotating human chromosome 21 (HS21) using cross-species comparisons with the mouse orthologs. It was noted that GPGA predicted true genes with better accuracy than other well-known approaches.
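
The "one exon at a time" idea in the abstract can be illustrated with a toy sketch. The abstract does not give GPGA's encoding, fitness function, or genetic operators, so everything below is an assumption made for illustration only: each candidate is a half-open interval `(start, end)`, the fitness is the sum of a hypothetical per-base score array (positive where a base looks coding, negative elsewhere), and the names `find_one_exon` and `find_exons` are invented here. It is not the authors' implementation.

```python
import random

def fitness(scores, start, end):
    """Toy fitness (assumption): sum of per-base scores over [start, end)."""
    return sum(scores[start:end])

def random_interval(rng, n):
    """Draw a random non-empty interval inside a sequence of length n."""
    start = rng.randrange(0, n - 1)
    end = rng.randrange(start + 1, n + 1)
    return (start, end)

def mutate(rng, interval, n, step=5):
    """Shift both boundaries by a small random amount, keeping start < end <= n."""
    start, end = interval
    start = min(max(0, start + rng.randint(-step, step)), n - 1)
    end = min(max(start + 1, end + rng.randint(-step, step)), n)
    return (start, end)

def crossover(a, b):
    """Take the start of one parent and the end of the other, if that is valid."""
    start, end = a[0], b[1]
    return (start, end) if start < end else a

def find_one_exon(scores, rng, pop_size=40, generations=60):
    """Small GA that searches for a single high-scoring interval."""
    n = len(scores)
    pop = [random_interval(rng, n) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda iv: fitness(scores, *iv), reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; the best survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(rng, crossover(a, b), n))
        pop = parents + children
    return max(pop, key=lambda iv: fitness(scores, *iv))

def find_exons(scores, max_exons, seed=0):
    """Solve one small sub-problem per exon instead of one joint search."""
    rng = random.Random(seed)
    work = list(scores)
    found = []
    for _ in range(max_exons):
        start, end = find_one_exon(work, rng)
        if fitness(work, start, end) <= 0:
            break  # nothing that looks coding is left
        found.append((start, end))
        for i in range(start, end):
            work[i] = -1.0  # mask the interval so the next search looks elsewhere
    return sorted(found)

# Hypothetical score track: two "coding" stretches inside non-coding sequence.
scores = [-1.0] * 20 + [1.0] * 30 + [-1.0] * 40 + [1.0] * 25 + [-1.0] * 15
exons = find_exons(scores, max_exons=2)
```

Each call to `find_one_exon` searches a two-parameter space (one start and one end), whereas encoding every exon boundary of a gene in a single chromosome would make the search space grow with the number of exons. That is the kind of decomposition the abstract describes, shown here only on synthetic data.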
