Authors: Lina Rosenberg, Bo Franzén, Gert Auer, Janne Lehtiö, Jenny Forshed
Keywords: Partial least squares regression, Feature selection, Meta-analysis, DNA microarray, Proteomics, Biology, Bioinformatics, Missing data, Cancer, Multivariate statistics, Computational biology
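Abstract: There is a vast need to find clinically applicable protein biomarkers as support in cancer diagnosis and tumour classification. In proteomics research, a number of methods can be used to obtain systemic information at the protein and pathway level in cells and tissues. One fundamental tool in analysing protein expression has been two-dimensional gel electrophoresis (2DE). Several cancer 2DE studies have reported partially redundant lists of differentially expressed proteins. To be able to further extract valuable information from existing 2DE data, the power of a multivariate meta-analysis is evaluated in this work. We here demonstrate a multivariate meta-analysis of 2DE proteomics data from human prostate and colon tumours. We developed a bioinformatic workflow for identifying common patterns over the two tumour types. This included pre-processing of the data and handling of missing values, followed by the development of a multivariate Partial Least Squares (PLS) model for prediction and variable selection. The variable selection was based on the variables' performance in the PLS model in combination with their stability in the validation. The PLS model development and variable selection were rigorously evaluated using a double cross-validation scheme. The most stable variables from a bootstrap validation gave a mean prediction success of 93% when predicting left-out test sets with models discriminating between normal and tumour tissue, common to the two tumour types. The analysis conducted in this study identified 14 proteins with a common trend between the tumour types prostate and colon, i.e. the same expression profile between normal and tumour samples. The workflow for meta-analysis developed in this study enabled the finding of a common protein profile for two malignant tumour types, which was not possible to identify when analysing the data sets separately.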
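
The idea of selecting variables by their stability under bootstrap validation can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single-component PLS model with one response, ranks variables by the absolute value of their first-component weight, and counts how often each variable lands among the top-ranked ones across bootstrap resamples. The function names (`pls1_weights`, `bootstrap_stability`), the parameters `n_boot` and `top_k`, and the synthetic data are all hypothetical, and the double cross-validation and missing-value handling mentioned in the abstract are left out.

```python
import math
import random


def center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def pls1_weights(X, y):
    """Weight vector of the first PLS component for a single response.

    For mean-centred data this is X^T y scaled to unit length.
    """
    n_vars = len(X[0])
    yc = center(y)
    cols = [center([row[j] for row in X]) for j in range(n_vars)]
    w = [sum(c * t for c, t in zip(col, yc)) for col in cols]
    norm = math.sqrt(sum(v * v for v in w))
    # A resample containing a single class gives a zero vector; return it as is.
    return [v / norm for v in w] if norm > 0 else w


def bootstrap_stability(X, y, n_boot=100, top_k=2, seed=0):
    """Fraction of bootstrap resamples in which each variable ranks in the top_k."""
    rng = random.Random(seed)
    n, n_vars = len(X), len(X[0])
    counts = [0] * n_vars
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        w = pls1_weights([X[i] for i in idx], [y[i] for i in idx])
        ranked = sorted(range(n_vars), key=lambda j: abs(w[j]), reverse=True)
        for j in ranked[:top_k]:
            counts[j] += 1
    return [c / n_boot for c in counts]


# Synthetic data (hypothetical): 20 samples, 6 variables, labels -1 = normal, +1 = tumour.
# Variables 0 and 1 carry the class signal; the remaining four are noise.
rng = random.Random(1)
y = [-1.0] * 10 + [1.0] * 10
X = [
    [2 * label + rng.gauss(0, 0.5), -2 * label + rng.gauss(0, 0.5)]
    + [rng.gauss(0, 1) for _ in range(4)]
    for label in y
]

freq = bootstrap_stability(X, y)
print([round(f, 2) for f in freq])
```

Variables with a selection frequency close to 1 would be the "stable" candidates in this simplified setting. A fuller version would wrap the selection step inside an outer cross-validation loop, so that prediction success is always measured on samples that played no part in choosing the variables.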