Model Selection in High-Dimensional Misspecified Models

Authors: Jinchi Lv, Yang Feng, Pallavi Basu

Abstract: Model selection is indispensable to high-dimensional sparse modeling in selecting the best set of covariates among a sequence of candidate models. Most existing work assumes implicitly that the model is correctly specified or of fixed dimensions. Yet model misspecification and high dimensionality are common in real applications. In this paper, we investigate two classical principles of model selection, the Kullback-Leibler divergence principle and the Bayesian principle, in the setting of high-dimensional misspecified models. Asymptotic expansions of these principles reveal that the effect of model misspecification is crucial and should be taken into account, leading to the generalized AIC and generalized BIC in high dimensions. With a natural choice of prior probabilities, we suggest the generalized BIC with prior probability, which involves a logarithmic factor of the dimensionality in penalizing model complexity. We further establish the consistency of the covariance contrast matrix estimator in a general setting. Our results and new method are supported by numerical studies.
