Exact Transcriptome Reconstruction from Short Sequence Reads

Authors: Vincent Lacroix, Michael Sammeth, Roderic Guigó, Anne Bergeron

DOI: 10.1007/978-3-540-87361-7_5

Keywords: Gene; RNA; Set (abstract data type); Data mining; Reconstruction problem; Sequence; Complement (set theory); Transcriptome; Biology; Computational biology; Large set (Ramsey theory)

Abstract: In this paper we address the problem of characterizing the RNA complement of a given cell type, that is, the set of RNA species and their relative copy number, from a large set of short sequence reads which have been randomly sampled from the cell's RNA sequences through a sequencing experiment. We refer to this problem as the transcriptome reconstruction problem, and we specifically investigate, both theoretically and practically, the conditions under which the problem can be solved. We demonstrate that, even under the assumption of exact information, neither single read nor paired-end read sequences guarantee theoretically that the reconstruction problem has a unique solution. However, by investigating the behavior of the best annotated human gene set, we also show that, in practice, paired-end reads, but not single reads, may be sufficient to solve the vast majority of the transcript variant species and abundances. We finally show that, when we assume that the RNA species existing in the cell are known, single read sequences can effectively be used to infer transcript variant abundances.
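
The non-uniqueness claim in the abstract can be illustrated with a small toy case. The sketch below is not taken from the paper: the gene layout, the function name `junction_counts`, and the copy numbers are all hypothetical. It assumes exact counts and reads so short that each one reveals at most a single exon or a single exon-exon junction. Under those assumptions, two different transcript sets produce identical observations.

```python
from collections import Counter


def junction_counts(transcripts):
    """Return exon and junction counts, weighted by transcript copy number.

    `transcripts` maps a tuple of exon ids (in order) to its copy number.
    This models idealized short reads that see one exon or one junction.
    """
    counts = Counter()
    for exons, copies in transcripts.items():
        for exon in exons:
            counts[("exon", exon)] += copies
        for left, right in zip(exons, exons[1:]):
            counts[("junction", left, right)] += copies
    return counts


# Hypothetical gene: exons 1, 3, 5 are always included; 2 and 4 are optional.
set_x = {(1, 2, 3, 4, 5): 1, (1, 3, 5): 1}   # both optional exons, or neither
set_y = {(1, 2, 3, 5): 1, (1, 3, 4, 5): 1}   # exactly one optional exon each

# Different transcript sets, identical short-read signal.
same_signal = junction_counts(set_x) == junction_counts(set_y)
print(same_signal)
```

In this toy case, only an observation that links exon 2 to exon 4 within one molecule, such as a read pair spanning exon 3, would separate the two sets. This is one way to read the abstract's contrast between single and paired-end reads.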
