Authors: C. Welch, I. Petersen, K. Walters, R. W. Morris, I. Nazareth
DOI: 10.1002/PDS.2270
Keywords:
Abstract: PURPOSE: In the UK, primary care databases include repeated measurements of health indicators at the individual level. As these databases encompass a large population, some individuals have extreme values, but some values may also be recorded incorrectly. The challenge for researchers is to distinguish between records that are due to incorrect recording and those which represent true values. This study evaluated different methods to identify outliers. METHODS: Ten percent of practices were selected at random to evaluate 513,367 height measurements. Population-level outliers were identified using boundaries defined from Health Survey for England data. Individual-level outliers were identified by fitting a random-effects model with subject-specific slopes for height measurements, adjusted for age and sex. Any height measurement with a patient-level standardised residual of more than ±10 was identified as an outlier and excluded. The model was subsequently refitted twice after removing outliers at each stage. This method was compared with existing methods. RESULTS: Most outliers were identified at the population level (1550 of 1643). Once these were removed from the database, fitting the random-effects model to the remaining data successfully identified only 75 further outliers. This method was more efficient at identifying outliers than existing methods. CONCLUSIONS: We propose a new, two-stage approach for identifying outliers in longitudinal data and show that it can identify outliers at both the population and individual level. Copyright © 2011 John Wiley & Sons, Ltd.
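
The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things here are assumptions made for illustration: the function names, heights stored in metres, and the bounds of 1.2 m and 2.2 m (the study derived its boundaries from Health Survey for England data, and the abstract does not give them). Stage 2 also swaps in a much simpler technique than the paper's random-effects model: each patient's median stands in for the fitted value, and residuals are standardised by a pooled robust scale. Only the ±10 threshold and the "fit, then refit twice" loop come from the abstract.

```python
from statistics import median

# Hypothetical plausibility bounds in metres (not taken from the study).
LOWER_M, UPPER_M = 1.2, 2.2


def population_stage(records, lower=LOWER_M, upper=UPPER_M):
    """Stage 1: separate records inside the bounds from those outside."""
    kept = [r for r in records if lower <= r[1] <= upper]
    flagged = [r for r in records if not (lower <= r[1] <= upper)]
    return kept, flagged


def individual_stage(records, threshold=10.0, passes=3):
    """Stage 2 (simplified stand-in for the random-effects model).

    Residual = value minus the patient's own median. Residuals are
    standardised by 1.4826 * median absolute residual, pooled over all
    patients. passes=3 mirrors an initial fit plus two refits.
    """
    kept, flagged = list(records), []
    for _ in range(passes):
        by_patient = {}
        for pid, height in kept:
            by_patient.setdefault(pid, []).append(height)
        centre = {pid: median(values) for pid, values in by_patient.items()}
        residuals = [height - centre[pid] for pid, height in kept]
        scale = 1.4826 * median(abs(e) for e in residuals)
        if scale == 0:
            break  # no spread to standardise against
        still_kept, newly_flagged = [], []
        for record, e in zip(kept, residuals):
            if abs(e / scale) > threshold:
                newly_flagged.append(record)
            else:
                still_kept.append(record)
        if not newly_flagged:
            break  # nothing removed, so a refit would change nothing
        kept = still_kept
        flagged.extend(newly_flagged)
    return kept, flagged


# Toy data: (patient id, height in metres). All values are invented.
records = [
    ("a", 1.70), ("a", 1.71), ("a", 1.69), ("a", 17.0),
    ("b", 1.60), ("b", 1.61), ("b", 1.59), ("b", 1.62), ("b", 1.95),
    ("c", 1.80), ("c", 1.81), ("c", 1.79),
]
after_stage1, population_outliers = population_stage(records)
clean, individual_outliers = individual_stage(after_stage1)
```

In this toy data, 17.0 lies outside the population bounds and is removed in stage 1. The value 1.95 is plausible for the population but far from patient b's other measurements, so only stage 2 can catch it. Running stage 1 first keeps gross errors from inflating the scale used in stage 2, which is in line with the abstract's report that most outliers were found at the population level.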