Locating protein-coding sequences under selection for additional, overlapping functions in 29 mammalian genomes.

Authors: M. F. Lin, P. Kheradpour, S. Washietl, B. J. Parker, J. S. Pedersen

DOI: 10.1101/GR.108753.110

Keywords: ENCODE, Genetic code, Synonymous substitution, Sequence alignment, Gene, Biology, Genetics, Computational biology, Genome, Negative selection, Conserved sequence

Abstract: The degeneracy of the genetic code allows protein-coding DNA and RNA sequences to simultaneously encode additional, overlapping functional elements. A sequence in which both protein-coding and additional overlapping functions have evolved under purifying selection should show increased evolutionary conservation compared to typical protein-coding genes, especially at synonymous sites. In this study, we use genome alignments of 29 placental mammals to systematically locate short regions within human ORFs that show conspicuously low estimated rates of synonymous substitution across these species. The 29-species alignment provides statistical power to locate more than 10,000 such regions with resolution down to nine-codon windows, which are found within more than a quarter of all human protein-coding genes and contain ∼2% of their synonymous sites. We collect numerous lines of evidence that the observed synonymous constraint in these regions reflects selection on overlapping functional elements including splicing regulatory elements, dual-coding genes, RNA secondary structures, microRNA target sites, and developmental enhancers. Our results show that overlapping functional elements are common in mammalian genes, despite the vast genomic landscape.

References (107)
Susan M. Rueter, T. Renee Dawson, Ronald B. Emeson. Regulation of alternative splicing by RNA editing. Nature, vol. 399, pp. 75-80 (1999). 10.1038/19992
Daniel Yekutieli, Yoav Benjamini. The control of the false discovery rate in multiple testing under dependency. Annals of Statistics, vol. 29, pp. 1165-1188 (2001). 10.1214/AOS/1013699998
Michael McKeown. Regulation of alternative splicing. Genetic Engineering, vol. 12, pp. 139-181 (1990). 10.1007/978-1-4613-0641-2_9
Stefan M. Stanley, Timothy L. Bailey, John S. Mattick. GONOME: measuring correlations between GO terms and genomic positions. BMC Bioinformatics, vol. 7, p. 94 (2006). 10.1186/1471-2105-7-94
Ivo L. Hofacker, Sven Findeiß, Stefan Washietl, Andreas R. Gruber, Peter F. Stadler. RNAz 2.0: improved noncoding RNA detection. Pacific Symposium on Biocomputing, pp. 69-79 (2010)
Thomas Down, Bernard Leong, Tim J. P. Hubbard. A machine learning strategy to identify candidate binding sites in human protein-coding sequence. BMC Bioinformatics, vol. 7, p. 419 (2006). 10.1186/1471-2105-7-419
Nick Goldman, Ziheng Yang. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Molecular Biology and Evolution, vol. 11, pp. 725-736 (1994). 10.1093/OXFORDJOURNALS.MOLBEV.A040153
J. D. Storey, R. Tibshirani. Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences of the United States of America, vol. 100, pp. 9440-9445 (2003). 10.1073/PNAS.1530509100
Georgina Lang, Wendy M. Gombert, Hannah J. Gould. A transcriptional regulatory element in the coding sequence of the human Bcl-2 gene. Immunology, vol. 114, pp. 25-36 (2005). 10.1111/J.1365-2567.2004.02073.X