The DART classification of unannotated transcription within the ENCODE regions: Associating transcription with known and novel loci

Authors: J. S. Rozowsky, D. Newburger, F. Sayward, J. Wu, G. Jordan

DOI: 10.1101/GR.5696007

Keywords:

Abstract: For the ∼1% of the human genome in the ENCODE regions, only about half of the transcriptionally active regions (TARs) identified with tiling microarrays correspond to annotated exons. Here we categorize this large amount of “unannotated transcription.” We use a number of disparate features to classify the 6988 novel TARs: array expression profiles across cell lines and conditions, sequence composition, phylogenetic profiles (presence/absence of syntenic conservation across 17 species), and locations relative to genes. In the classification, we first filter out TARs with unusual sequence composition and those likely resulting from cross-hybridization. We then associate some of those remaining with proximal exons having correlated expression profiles. Finally, we cluster unclassified TARs into putative novel loci, based on similar expression profiles and phylogenetic profiles. To encapsulate our classification, we construct a Database of Active Regions and Tools (DART.gersteinlab.org). DART has special facilities for rapidly handling and comparing many sets of TARs and their heterogeneous features, synchronizing across builds, and interfacing with other resources. Overall, we find that ∼14% of the novel TARs can be associated with known genes, while ∼21% can be clustered into ∼200 novel loci. We observe that TARs associated with genes are enriched in the potential to form structural RNAs and that many novel TAR clusters are associated with nearby promoters. To benchmark our classification, we design a set of experiments for testing the connectivity of novel TARs. We find that 18 of the 46 connections tested validate by RT-PCR, and four of five sequenced PCR products confirm connectivity unambiguously.
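The step that associates a TAR with a proximal exon having a correlated expression profile can be illustrated with a minimal sketch. This is not the authors' implementation: the record structure, the `max_distance` window, the `min_corr` cutoff, and the gene names are all hypothetical, and Pearson correlation is assumed as the similarity measure because the abstract does not name one.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def associate_tar(tar, exons, max_distance=10_000, min_corr=0.9):
    """Return the gene of the best-correlated proximal exon, or None.

    max_distance (bp) and min_corr are illustrative values only.
    """
    best_gene, best_corr = None, min_corr
    for exon in exons:
        if exon["chrom"] != tar["chrom"]:
            continue
        # Gap between the two intervals; 0 if they overlap.
        gap = max(exon["start"] - tar["end"], tar["start"] - exon["end"], 0)
        if gap > max_distance:
            continue
        r = pearson(tar["profile"], exon["profile"])
        if r >= best_corr:
            best_gene, best_corr = exon["gene"], r
    return best_gene


# Hypothetical records: one expression value per cell line or condition.
tar = {"chrom": "chr21", "start": 5000, "end": 5200,
       "profile": [1.0, 2.0, 3.0, 4.0]}
exons = [
    {"gene": "GENE_A", "chrom": "chr21", "start": 6000, "end": 6300,
     "profile": [2.0, 4.1, 6.0, 8.2]},   # nearby and correlated
    {"gene": "GENE_B", "chrom": "chr21", "start": 7000, "end": 7400,
     "profile": [4.0, 1.0, 3.0, 2.0]},   # nearby but anti-correlated
    {"gene": "GENE_C", "chrom": "chr21", "start": 900000, "end": 900300,
     "profile": [1.0, 2.0, 3.0, 4.0]},   # correlated but too distant
]

print(associate_tar(tar, exons))  # GENE_A
```

Under these assumptions, TARs for which the function returns `None` would remain unclassified and move on to the clustering step described in the abstract.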
