Supervised Methods with Genomic Data: a Review and Cautionary View

Author: Ramón Díaz-Uriarte

DOI: 10.1002/0470094419.CH12

Abstract: We review well accepted methods to address questions about differential expression of genes and class prediction from gene expression data. We highlight some new topics that deserve more attention: testing differential expression of specific groups of genes, intra-group heterogeneity and class prediction, interaction in predictors, visualisation, difficulties in the biological interpretation of predictor genes and molecular signatures, and the use of receiver operating characteristic (ROC)-based statistics for evaluating predictors and differential expression. We end with a review of some serious problems that can limit the potential of these methods; we focus especially on inadequate assessment of the performance of new methods (due to inadequate estimation of error rates and evaluation of performance with few, and “easy”, data sets) and the failure to recognise observational studies and include needed covariates. A final comment is made about the need for freely available source code.
