Optimal Scaling of Digital Transcriptomes

Authors: Gustavo Glusman, Juan Caballero, Max Robinson, Burak Kutlu, Leroy Hood

DOI: 10.1371/JOURNAL.PONE.0077885

Abstract: Deep sequencing of transcriptomes has become an indispensable tool for biology, enabling the expression levels of thousands of genes to be compared across multiple samples. Since transcript counts scale with sequencing depth, counts from different samples must be normalized to a common scale prior to comparison. We analyzed fifteen existing and novel algorithms for normalizing transcript counts, and evaluated the effectiveness of the resulting normalizations. For this purpose we defined two mutually independent metrics: (1) the number of “uniform” genes (genes whose normalized expression levels have a sufficiently low coefficient of variation), and (2) the Spearman correlation between the normalized expression profiles of gene pairs. We also define four novel algorithms, one of which explicitly maximizes the number of uniform genes, and compared the performance of all fifteen algorithms. The most commonly used methods (scaling to a fixed total value, or equalizing the expression of certain ‘housekeeping’ genes) yielded particularly poor results, surpassed even by normalization based on randomly selected gene sets. Conversely, seven of the algorithms approached what appears to be optimal normalization. Three of these algorithms rely on the identification of “ubiquitous” genes: genes expressed in all the samples studied, but never at very high levels. We demonstrate that these include a “core” of genes expressed in many tissues in a mutually consistent pattern, which is suitable for use as an internal normalization guide. The new methods yield robustly normalized expression values, a prerequisite for the identification of differentially expressed and tissue-specific genes as potential biomarkers.
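The first metric in the abstract, the number of “uniform” genes, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy counts, and the coefficient-of-variation cutoff of 0.25 are all assumptions chosen for illustration, and fixed-total scaling is used only because it is the simplest normalization the abstract names (the abstract itself reports that it performs poorly on real data).

```python
from statistics import mean, pstdev


def scale_to_total(counts, target=1_000_000):
    """Fixed-total scaling: rescale one sample so its counts sum to `target`."""
    total = sum(counts)
    return [c * target / total for c in counts]


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    m = mean(values)
    return pstdev(values) / m if m > 0 else float("inf")


def count_uniform_genes(samples, cv_cutoff=0.25):
    """Count genes whose levels across samples have CV <= cv_cutoff.

    `samples` is a list of per-sample count lists in the same gene order.
    The cutoff is a hypothetical value, not one taken from the paper.
    """
    per_gene = zip(*samples)  # regroup values gene by gene
    return sum(
        1 for levels in per_gene
        if coefficient_of_variation(levels) <= cv_cutoff
    )


# Toy data: three genes in two samples; the second sample was
# sequenced twice as deeply, so every raw count is doubled.
raw = [[10, 20, 70], [20, 40, 140]]
normalized = [scale_to_total(sample) for sample in raw]

uniform_before = count_uniform_genes(raw)
uniform_after = count_uniform_genes(normalized)
```

In this toy case the only difference between the samples is sequencing depth, so no gene passes the cutoff before scaling and all three pass after it. Real samples also differ in composition, which is the situation where the abstract reports that fixed-total scaling breaks down.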
