Fridge: Focused fine-tuning of ridge regression for personalized predictions.

Authors: Kristoffer H. Hellton, Nils Lid Hjort

DOI: 10.1002/SIM.7576

Keywords: Oracle, Parametric statistics, Ridge (differential geometry), Logistic regression, Regression, Linear regression, Focused information criterion, Algorithm, Covariate, Computer science

Abstract: Statistical prediction methods typically require some form of fine-tuning of tuning parameter(s), with K-fold cross-validation as the canonical procedure. For ridge regression, there exist numerous procedures, but common for all, including cross-validation, is that one single parameter is chosen for all future predictions. We propose instead to calculate a unique tuning parameter for each individual for which we wish to predict an outcome. This generates an individualized prediction by focusing on the vector of covariates of a specific individual. The focused ridge (fridge) procedure is introduced with a 2-part contribution: First we define an oracle tuning parameter minimizing the mean squared prediction error of a specific covariate vector, and then we propose to estimate this tuning parameter by using plug-in estimates of the regression coefficients and error variance parameter. The procedure is extended to logistic ridge regression by using parametric bootstrap. For high-dimensional data, we propose to use ridge regression with cross-validation as the plug-in estimate, and simulations show that fridge gives smaller average prediction error than ridge with cross-validation for both simulated and real data. We illustrate the new concept for both linear and logistic regression models in 2 applications of personalized medicine: predicting individual risk and treatment response based on gene expression data. The method is implemented in the R package fridge.
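The two-part idea in the abstract can be illustrated for the linear case with a minimal sketch. It is not the API of the R package fridge: the function names `focused_mse` and `focused_lambda`, the grid search, and the use of ordinary least squares as the plug-in estimate (reasonable only when there are more observations than covariates) are all assumptions made here for illustration. The sketch uses the standard decomposition of the mean squared error of a ridge prediction at a covariate vector x0 into a variance term, sigma^2 * x0' W X'X W x0, and a squared bias term, (lambda * x0' W beta)^2, where W = (X'X + lambda I)^{-1}.

```python
import numpy as np


def focused_mse(X, x0, beta, sigma2, lam):
    """MSE of the ridge prediction x0' beta_hat(lam) for given beta and sigma2."""
    p = X.shape[1]
    gram = X.T @ X
    # w = (X'X + lam * I)^{-1} x0, so the ridge prediction is w' X' y
    w = np.linalg.solve(gram + lam * np.eye(p), x0)
    variance = sigma2 * (w @ gram @ w)
    bias = -lam * (w @ beta)
    return variance + bias ** 2


def focused_lambda(X, y, x0, lambdas):
    """Pick the grid value minimizing the plug-in focused MSE at x0.

    Assumption: OLS plug-in estimates, which require n > p.
    """
    lambdas = np.asarray(lambdas, dtype=float)
    n, p = X.shape
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_ols
    sigma2_hat = resid @ resid / (n - p)
    mses = [focused_mse(X, x0, beta_ols, sigma2_hat, lam) for lam in lambdas]
    return float(lambdas[int(np.argmin(mses))])


# Illustrative data (simulated; not from the paper)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 5))
y_demo = X_demo @ np.array([1.0, 0.5, 0.0, 0.0, -0.5]) + rng.normal(size=50)
grid = np.linspace(0.0, 20.0, 201)

# Each new individual gets its own tuning parameter
for x_new in rng.normal(size=(2, 5)):
    print(focused_lambda(X_demo, y_demo, x_new, grid))
```

Different covariate vectors will generally select different values from the grid, which is the contrast with cross-validation choosing one value for all predictions. For high-dimensional data the abstract indicates that ridge regression with cross-validation replaces OLS as the plug-in estimate; that variant is not shown here.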
