Partially parametric techniques for multiple imputation

Authors: Nathaniel Schenker, Jeremy M.G. Taylor

DOI: 10.1016/0167-9473(95)00057-7

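Abstract: Multiple imputation is a technique for handling data sets with missing values. The method fills in the missing values several times, creating several completed data sets for analysis. Each data set is analyzed separately using techniques designed for complete data, and the results are then combined in such a way that the variability due to imputation may be incorporated. Methods of imputing the missing values can vary from fully parametric to nonparametric. In this paper, we compare partially parametric and fully parametric regression-based multiple-imputation methods. The fully parametric method that we consider imputes missing regression outcomes by drawing them from their predictive distribution under the regression model, whereas the partially parametric methods are based on imputing outcomes or residuals for incomplete cases using values drawn from the complete cases. For the partially parametric methods, we suggest a new approach to choosing complete cases from which to draw values. In a Monte Carlo study in the regression setting, we investigate the robustness of the multiple-imputation schemes to misspecification of the underlying model for the data. Sources of model misspecification considered include incorrect modeling of the mean structure as well as specification of the error distribution with regard to heaviness of the tails and heteroscedasticity. The methods are compared with respect to the bias and efficiency of point estimates and the coverage rates of confidence intervals for the marginal mean and distribution function of the outcome. We find that when the model is specified correctly, all of the methods perform well, even if the error distribution is misspecified. The fully parametric approach, however, produces slightly more efficient estimates of the marginal distribution function of the outcome than do the partially parametric approaches. When the mean structure is misspecified, all of the methods still perform well for estimating the marginal mean, although the fully parametric method shows slight increases in bias and variance. For estimating the marginal distribution function, however, the fully parametric method breaks down in several situations, whereas the partially parametric methods maintain their good performance. In an application to AIDS research in a setting that is similar to, although slightly more complicated than, that of the Monte Carlo study, we examine how estimates for the distribution of the time from infection with HIV to the onset of AIDS vary with the method used to impute the residual time to AIDS for subjects with right-censored data. The fully parametric and partially parametric techniques produce similar results, suggesting that the model selection used for fully parametric imputation was adequate. Our application provides an example of how multiple imputation can be used to combine information from two cohorts to estimate quantities that cannot be estimated directly from either one of the cohorts separately.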
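
The combining step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard combining rules for multiple imputation (pooled estimate, within-imputation variance, between-imputation variance); the function name `combine_estimates` and the example numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def combine_estimates(estimates, variances):
    """Pool results from m completed data sets (illustrative helper, not from the paper).

    estimates: point estimates, one per completed data set
    variances: the corresponding completed-data variance estimates
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    # Total variance adds the extra variability due to imputation.
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical results from m = 3 completed data sets.
pooled, total = combine_estimates([10.0, 12.0, 11.0], [1.0, 1.0, 1.0])
print(pooled, total)
```

The between-imputation term is what lets interval estimates reflect the uncertainty from the missing values; analyzing a single completed data set would leave it out.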
