Two-stage method to remove population- and individual-level outliers from longitudinal data in a primary care database.

Authors: C. Welch, I. Petersen, K. Walters, R. W. Morris, I. Nazareth

DOI: 10.1002/PDS.2270

Abstract: PURPOSE: In the UK, primary care databases include repeated measurements of health indicators at the individual level. As these databases encompass a large population, some individuals have extreme values, but values may also be recorded incorrectly. The challenge for researchers is to distinguish between records that are due to incorrect recording and those which represent true values. This study evaluated different methods to identify outliers. METHODS: Ten percent of practices were selected at random to evaluate 513,367 height measurements. Population-level outliers were identified using boundaries defined by Health Survey for England data. Individual-level outliers were identified by fitting a random-effects model with subject-specific slopes, adjusted for age and sex. Any measurement with a patient-level standardised residual beyond ±10 was identified as an outlier and excluded. The model was subsequently refitted twice after removing outliers at each stage. This method was compared with existing methods. RESULTS: Most outliers were identified at the population level (1550 of 1643). Once these were removed from the database, fitting the model to the remaining data successfully identified only 75 further outliers. The method was more efficient at identifying outliers than existing methods. CONCLUSIONS: We propose a new, two-stage approach to identifying outliers in longitudinal data and show that it can identify outliers at both the population and individual level. Copyright © 2011 John Wiley & Sons, Ltd.
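The two-stage control flow described in the abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the bounds of 120 and 220 cm are hypothetical placeholders (the study derived its boundaries from Health Survey for England data), all function names are invented for this sketch, and the random-effects model is replaced by a much cruder stand-in, namely the deviation of each measurement from the patient's own median, scaled by a pooled robust standard deviation. Only the ±10 cut-off and the two refits come from the abstract.

```python
from collections import defaultdict
from statistics import median

# ASSUMPTION: hypothetical plausibility bounds in cm. The study used
# boundaries derived from Health Survey for England data instead.
LOWER_CM, UPPER_CM = 120.0, 220.0
THRESHOLD = 10.0  # standardised-residual cut-off quoted in the abstract
REFITS = 2        # the abstract says the model was refitted twice


def population_stage(records):
    """Stage 1: split (patient_id, height) records by the population bounds."""
    kept = [r for r in records if LOWER_CM <= r[1] <= UPPER_CM]
    dropped = [r for r in records if not LOWER_CM <= r[1] <= UPPER_CM]
    return kept, dropped


def standardised_residuals(records):
    """Crude stand-in for the random-effects fit (NOT the paper's model).

    Each residual is the distance from the patient's own median height,
    divided by a pooled robust scale (1.4826 * median absolute residual).
    Only meaningful for patients with three or more measurements.
    """
    if not records:
        return []
    by_patient = defaultdict(list)
    for pid, height in records:
        by_patient[pid].append(height)
    centre = {pid: median(hs) for pid, hs in by_patient.items()}
    residuals = [height - centre[pid] for pid, height in records]
    # Fall back to a scale of 1.0 when the pooled MAD is exactly zero.
    scale = 1.4826 * median(abs(r) for r in residuals) or 1.0
    return [r / scale for r in residuals]


def individual_stage(records):
    """Stage 2: drop records with |residual| > THRESHOLD, then refit."""
    kept, dropped = list(records), []
    for _ in range(1 + REFITS):  # initial fit plus two refits
        z = standardised_residuals(kept)
        flagged = [r for r, zi in zip(kept, z) if abs(zi) > THRESHOLD]
        if not flagged:
            break
        dropped += flagged
        kept = [r for r, zi in zip(kept, z) if abs(zi) <= THRESHOLD]
    return kept, dropped


def two_stage(records):
    """Run both stages; return (clean, population_outliers, individual_outliers)."""
    kept, population_outliers = population_stage(records)
    kept, individual_outliers = individual_stage(kept)
    return kept, population_outliers, individual_outliers
```

Calling `two_stage` on a list of `(patient_id, height_cm)` tuples returns the retained records and the two sets of outliers separately, which mirrors how the abstract reports counts per stage. Running the population stage first keeps gross recording errors from inflating the scale estimate used in the individual stage. A faithful version would swap `standardised_residuals` for residuals from a fitted mixed model with subject-specific slopes, adjusted for age and sex.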
