Multivariate meta-analysis of proteomics data from human prostate and colon tumours

Authors: Lina Rosenberg, Bo Franzén, Gert Auer, Janne Lehtiö, Jenny Forshed

DOI: 10.1186/1471-2105-11-468

Keywords: Partial least squares regression, Feature selection, Meta-analysis, DNA microarray, Proteomics, Biology, Bioinformatics, Missing data, Cancer, Multivariate statistics, Computational biology

Abstract: There is a vast need to find clinically applicable protein biomarkers as support in cancer diagnosis and tumour classification. In proteomics research, a number of methods can be used to obtain systemic information on protein and pathway level on cells and tissues. One fundamental tool in analysing protein expression has been two-dimensional gel electrophoresis (2DE). Several cancer 2DE studies have reported partially redundant lists of differently expressed proteins. To be able to further extract valuable information from existing 2DE data, the power of a multivariate meta-analysis will be evaluated in this work. We here demonstrate a multivariate meta-analysis of 2DE proteomics data from human prostate and colon tumours. We developed a bioinformatic workflow for identifying common patterns over two tumour types. This included dealing with pre-processing of data and handling of missing values, followed by the development of a multivariate Partial Least Squares (PLS) model for prediction and variable selection. The variable selection was based on the variables' performance in the PLS model in combination with stability in the validation. The PLS model development and variable selection was rigorously evaluated using a double cross-validation scheme. The most stable variables from a bootstrap validation gave a mean prediction success of 93% when predicting left-out test sets on models discriminating between normal and tumour tissue, common for the two tumour types. The analysis conducted in this study identified 14 proteins with a common trend between the tumour types prostate and colon, i.e. the same expression profile between normal and tumour samples. The workflow for meta-analysis developed in this study enabled the finding of a common protein profile for two malign tumour types, which was not possible to identify when analysing the data sets separately.
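
The double cross-validation scheme mentioned in the abstract can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' workflow. The data are synthetic, and all function names (`pls1_fit`, `pls1_predict`, `kfold_indices`, `double_cv`, `make_demo_data`) are hypothetical. The PLS model is a plain single-response PLS1 with a 0.5 decision threshold on 0/1 class labels, and the inner loop is assumed to tune only the number of PLS components. Pre-processing, missing-value handling, variable selection and the bootstrap validation are left out.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model (PLS1 with NIPALS-style deflation)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                     # weights: covariance of X with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                  # no covariance left to model
            break
        w = w / norm
        t = Xk @ w                        # scores
        tt = t @ t
        p = Xk.T @ t / tt                 # X loadings
        c = (yk @ t) / tt                 # y loading
        Xk = Xk - np.outer(t, p)          # deflate X
        yk = yk - c * t                   # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        coef = np.zeros(X.shape[1])
    else:
        Wm, Pm = np.array(W).T, np.array(P).T
        coef = Wm @ np.linalg.solve(Pm.T @ Wm, np.array(q))
    return coef, x_mean, y_mean


def pls1_predict(model, X):
    """Predict 0/1 class labels by thresholding the PLS response at 0.5."""
    coef, x_mean, y_mean = model
    return ((X - x_mean) @ coef + y_mean > 0.5).astype(int)


def kfold_indices(n, k, rng):
    """Split a random permutation of range(n) into k folds."""
    return np.array_split(rng.permutation(n), k)


def cv_accuracy(X, y, n_components, k, rng):
    """Plain k-fold cross-validated accuracy for a fixed model size."""
    folds = kfold_indices(len(y), k, rng)
    correct = 0
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = pls1_fit(X[train], y[train], n_components)
        correct += np.sum(pls1_predict(model, X[test]) == y[test])
    return correct / len(y)


def double_cv(X, y, max_components=3, outer_k=5, inner_k=4, seed=0):
    """Outer loop estimates prediction success; inner loop tunes the model."""
    rng = np.random.default_rng(seed)
    outer = kfold_indices(len(y), outer_k, rng)
    correct = 0
    for i, test in enumerate(outer):
        train = np.concatenate([f for j, f in enumerate(outer) if j != i])
        # Model size is chosen on the outer training samples only, so the
        # outer test fold never influences model development.
        scores = [cv_accuracy(X[train], y[train], a, inner_k, rng)
                  for a in range(1, max_components + 1)]
        best = 1 + int(np.argmax(scores))
        model = pls1_fit(X[train], y[train], best)
        correct += np.sum(pls1_predict(model, X[test]) == y[test])
    return correct / len(y)


def make_demo_data(seed=1, n_per_class=30, n_vars=20):
    """Synthetic stand-in for a spot-intensity matrix (not real 2DE data)."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_per_class)    # 0 = "normal", 1 = "tumour"
    X = rng.normal(size=(2 * n_per_class, n_vars))
    X[y == 1, :3] += 2.0                  # three informative variables
    return X, y


X, y = make_demo_data()
accuracy = double_cv(X, y)
print(f"double cross-validated accuracy: {accuracy:.2f}")
```

The point of the nesting is that every tuning decision is made without access to the outer test fold, so the outer accuracy is an honest estimate of prediction success on left-out samples.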
