Transcript length bias in RNA-seq data confounds systems biology

Authors: Alicia Oshlack, Matthew J Wakefield

DOI: 10.1186/1745-6150-4-14

Keywords: Transcription (biology), Systems biology, Computational biology, RNA-Seq, Genome, Biology, Genetics, Gene expression profiling, Deep sequencing, Gene, Transcriptome

Abstract: Several recent studies have demonstrated the effectiveness of deep sequencing for transcriptome analysis (RNA-seq) in mammals. As RNA-seq becomes more affordable, whole genome transcriptional profiling is likely to become the platform of choice for species with good genomic sequences. As yet, a rigorous analysis methodology has not been developed and we are still in the stages of exploring the features of the data. We investigated the effect of transcript length bias in RNA-seq data using three different published data sets. For standard analyses using aggregated tag counts for each gene, the ability to call differentially expressed genes between samples is strongly associated with the length of the transcript. Transcript length bias in calling differentially expressed genes is a general feature of current protocols for RNA-seq technology. This has implications for the ranking of differentially expressed genes, and in particular may introduce bias in gene set testing for pathway analysis and other multi-gene systems biology analyses. This article was reviewed by Rohan Williams (nominated by Gavin Huttley), Nicole Cloonan (nominated by Mark Ragan) and James Bullard (nominated by Sandrine Dudoit).
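The link between transcript length and the ability to call differential expression can be illustrated with a minimal sketch. It assumes, for illustration only, that aggregated tag counts for a gene are Poisson distributed with a mean proportional to expression rate, transcript length, and sequencing depth, and it uses a simple normal-approximation z-statistic rather than any specific test from the article. The function name `expected_z` and all parameter values are hypothetical.

```python
import math


def expected_z(rate_a, rate_b, length, depth=1.0):
    """Approximate z-statistic for the difference between two Poisson counts.

    Assumption of this sketch: the expected tag count of a gene is
    rate * length * depth, and the variance equals the mean (Poisson).
    """
    mu_a = rate_a * length * depth
    mu_b = rate_b * length * depth
    # Difference of expected counts divided by its standard deviation.
    return (mu_b - mu_a) / math.sqrt(mu_a + mu_b)


# The same two-fold change in per-base expression at three transcript lengths.
for length in (500, 2000, 8000):
    z = expected_z(0.01, 0.02, length)
    print(f"length={length:5d}  z={z:.2f}  significant at 5%: {abs(z) > 1.96}")
```

Under these assumptions the statistic grows with the square root of transcript length, so an identical fold change is more likely to pass a fixed significance threshold for a long transcript than for a short one.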

References (11)
Mark J Dunning, Nuno L Barbosa-Morais, Andy G Lynch, Simon Tavaré, Matthew E Ritchie. Statistical issues in the analysis of Illumina data. BMC Bioinformatics, vol. 9, p. 85 (2008). DOI: 10.1186/1471-2105-9-85
Gordon K Smyth. Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments. Statistical Applications in Genetics and Molecular Biology, vol. 3, pp. 1-25 (2004). DOI: 10.2202/1544-6115.1027
M. Sultan, M. H. Schulz, H. Richard, A. Magen, A. Klingenhoff, M. Scherf, M. Seifert, T. Borodina, A. Soldatov, D. Parkhomchuk, D. Schmidt, S. O'Keeffe, S. Haas, M. Vingron, H. Lehrach, M.-L. Yaspo. A Global View of Gene Activity and Alternative Splicing by Deep Sequencing of the Human Transcriptome. Science, vol. 321, pp. 956-960 (2008). DOI: 10.1126/SCIENCE.1160342
Zhijin Wu, Rafael A. Irizarry. Stochastic models inspired by hybridization theory for short oligonucleotide arrays. Journal of Computational Biology, vol. 12, pp. 882-893 (2005). DOI: 10.1089/CMB.2005.12.882
Nicole Cloonan, Alistair R R Forrest, Gabriel Kolle, Brooke B A Gardiner, Geoffrey J Faulkner, Mellissa K Brown, Darrin F Taylor, Anita L Steptoe, Shivangi Wani, Graeme Bethel, Alan J Robertson, Andrew C Perkins, Stephen J Bruce, Clarence C Lee, Swati S Ranade, Heather E Peckham, Jonathan M Manning, Kevin J McKernan, Sean M Grimmond. Stem cell transcriptome profiling via massive-scale mRNA sequencing. Nature Methods, vol. 5, pp. 613-619 (2008). DOI: 10.1038/NMETH.1223
J. C. Marioni, C. E. Mason, S. M. Mane, M. Stephens, Y. Gilad. RNA-seq: An assessment of technical reproducibility and comparison with gene expression arrays. Genome Research, vol. 18, pp. 1509-1517 (2008). DOI: 10.1101/GR.079558.108
Da Wei Huang, Brad T. Sherman, Richard A. Lempicki. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Research, vol. 37, pp. 1-13 (2009). DOI: 10.1093/NAR/GKN923
Stéphane Audic, Jean-Michel Claverie. The Significance of Digital Gene Expression Profiles. Genome Research, vol. 7, pp. 986-995 (1997). DOI: 10.1101/GR.7.10.986
Eric T. Wang, Rickard Sandberg, Shujun Luo, Irina Khrebtukova, Lu Zhang, Christine Mayr, Stephen F. Kingsmore, Gary P. Schroth, Christopher B. Burge. Alternative Isoform Regulation in Human Tissue Transcriptomes. Nature, vol. 456, pp. 470-476 (2008). DOI: 10.1038/NATURE07509
Da Wei Huang, Brad T Sherman, Richard A Lempicki. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature Protocols, vol. 4, pp. 44-57 (2009). DOI: 10.1038/NPROT.2008.211