Model selection bias and Freedman’s paradox

Authors: Paul M. Lukacs, Kenneth P. Burnham, David R. Anderson

DOI: 10.1007/S10463-009-0234-4

Keywords: Feature selection, Model selection, Akaike information criterion, Mathematics, Spurious relationship, Selection bias, Statistics, Bayesian information criterion, Estimator, Econometrics, Freedman's paradox

Abstract: In situations where limited knowledge of a system exists and the ratio of data points to variables is small, variable selection methods can often be misleading. Freedman (Am Stat 37:152–155, 1983) demonstrated how common it is to select completely unrelated variables as highly “significant” when the number of data points is similar in magnitude to the number of variables. A new type of model averaging estimator based on model selection with Akaike’s AIC is used with linear regression to investigate the problems of likely inclusion of spurious effects and model selection bias, the bias introduced while using the data to select a single seemingly “best” model from a (often large) set of models employing many predictor variables. The new model averaging estimator helps reduce these problems and provides confidence interval coverage at the nominal level, while traditional stepwise selection has poor inferential properties.
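
As a hedged illustration of the AIC-based model averaging mentioned in the abstract, the sketch below converts AIC scores into Akaike weights and uses them to average one regression coefficient across candidate models. The AIC scores and coefficient values are hypothetical, and the function names are invented for this example. Treating the coefficient as zero in models that omit the predictor is an assumption about the form of the estimator; the abstract does not spell it out.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC scores into Akaike weights that sum to 1."""
    best = min(aic_values)
    # Delta_i = AIC_i - AIC_min; the relative likelihood is exp(-Delta_i / 2).
    rel = [math.exp(-(a - best) / 2.0) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


def model_averaged_coefficient(weights, estimates):
    """Weighted average of one coefficient over all candidate models.

    estimates[i] is the estimate from model i, or None when the predictor
    is absent from model i. Absent is counted as 0 (an assumption), which
    shrinks weakly supported effects toward zero.
    """
    return sum(w * (0.0 if b is None else b) for w, b in zip(weights, estimates))


# Hypothetical AIC scores and estimates of one coefficient in three models.
aic = [100.0, 102.0, 110.0]
beta = [0.8, None, 0.5]  # the second model omits this predictor

w = akaike_weights(aic)
beta_avg = model_averaged_coefficient(w, beta)
print(w, beta_avg)
```

A predictor that appears only in poorly supported models receives little total weight, so its averaged coefficient is pulled toward zero instead of being reported at full size from a single selected model.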

References (27)
Clifford M. Hurvich, Chih-Ling Tsai. Regression and time series model selection in small samples. Biometrika, vol. 76, pp. 297-307 (1989). DOI: 10.1093/BIOMET/76.2.297
Alan J. Miller. Subset Selection in Regression (2002).
Matthew W. Wheeler, A. John Bailer. Comparing model averaging with other model selection strategies for benchmark dose estimation. Environmental and Ecological Statistics, vol. 16, pp. 37-51 (2009). DOI: 10.1007/S10651-007-0071-7
S. T. Buckland, K. P. Burnham, N. H. Augustin. Model selection: An integral part of inference. Biometrics, vol. 53, pp. 603-618 (1997). DOI: 10.2307/2533961
Edward I. George, Robert E. McCulloch. Variable Selection via Gibbs Sampling. Journal of the American Statistical Association, vol. 88, pp. 881-889 (1993). DOI: 10.1080/01621459.1993.10476353
Fred S. Guthery, Kenneth P. Burnham, David R. Anderson. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. The Journal of Wildlife Management, vol. 67, pp. 655- (2003). DOI: 10.2307/3802723
Clifford M. Hurvich, Chih-Ling Tsai. The Impact of Model Selection on Inference in Linear Regression. The American Statistician, vol. 44, pp. 214-217 (1990). DOI: 10.1080/00031305.1990.10475722
David A. Freedman. A Note on Screening Regression Equations. The American Statistician, vol. 37, pp. 152-155 (1983). DOI: 10.1080/00031305.1983.10482729
Hirotugu Akaike. Information Theory and an Extension of the Maximum Likelihood Principle. International Symposium on Information Theory, vol. 1, pp. 610-624 (1973). DOI: 10.1007/978-1-4612-1694-0_15