Distinguishing protein-coding and noncoding genes in the human genome

Authors: M. Clamp, B. Fry, M. Kamal, X. Xie, J. Cuff

DOI: 10.1073/PNAS.0709013104

Abstract: Although the Human Genome Project was completed 4 years ago, the catalog of human protein-coding genes remains a matter of controversy. Current catalogs list a total of ≈24,500 putative protein-coding genes. It is broadly suspected that a large fraction of these entries are functionally meaningless ORFs present by chance in RNA transcripts, because they show no evidence of evolutionary conservation with mouse or dog. However, there is currently no scientific justification for excluding ORFs simply because they fail to show evolutionary conservation: the alternative hypothesis is that most of these ORFs are actually valid human genes that reflect gene innovation in the primate lineage or gene loss in the other lineages. Here, we reject this hypothesis by carefully analyzing the nonconserved ORFs, specifically their properties in other primates. We show that the vast majority of these ORFs are random occurrences. The analysis yields, as a by-product, a major revision of the current human catalogs, cutting the number of protein-coding genes to ≈20,500. Specifically, it suggests that nonconserved ORFs should be added to the human gene catalog only if there is clear evidence of an encoded protein. It also provides a principled methodology for evaluating future proposed additions to the human gene catalog. Finally, the results indicate that there has been relatively little true innovation in mammalian protein-coding genes.
