Authors: BRITTANY M. HOLLISTER, NICOLE A. RESTREPO, ERIC FARBER-EGER, DANA C. CRAWFORD, MELINDA C. ALDRICH
DOI: 10.1142/9789813207813_0023
Keywords: Psychology, Medicaid, Social class, MEDLINE, Biobank, Data extraction, Social environment, Algorithm, Socioeconomic status, Biorepository
Abstract: Socioeconomic status (SES) is a fundamental contributor to health, and a key factor underlying racial disparities in disease. However, SES data are rarely included in genetic studies, due in part to the difficulty of collecting these data when studies were not originally designed for that purpose. The emergence of large clinic-based biobanks linked to electronic health records (EHRs) provides research access to large patient populations with longitudinal phenotype data captured in structured fields as billing codes, procedure codes, and prescriptions. SES data, however, are often not explicitly recorded in structured fields, but rather in the free text of clinical notes and communications. The content and completeness of these data vary widely by practitioner. To enable gene-environment studies that consider SES as an exposure, we sought to extract SES variables from racial/ethnic minority adult patients (n=9,977) in BioVU, the Vanderbilt University Medical Center biorepository linked to de-identified EHRs. We developed several measures of SES using information available within the de-identified EHR, including broad categories of occupation, education, insurance status, and homelessness. Two hundred patients were randomly selected for manual review to develop a set of seven algorithms for extracting SES information from de-identified EHRs. The algorithms consist of 15 categories of information, with 830 unique search terms. SES data extracted from manual review of 50 randomly selected records were compared to data produced by the algorithm, resulting in positive predictive values of 80.0% (education), 85.4% (occupation), 87.5% (unemployment), 63.6% (retirement), 23.1% (uninsured), 81.8% (Medicaid), and 33.3% (homelessness), suggesting some categories of SES data are easier to extract in this EHR than others. The SES data extraction approach developed here will enable future EHR-based genetic studies to integrate SES information into statistical analyses. Ultimately, incorporation of measures of SES into genetic studies will help elucidate the impact of the social environment on disease risk and outcomes.
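The following is a minimal Python sketch of the general idea described in the abstract: flagging SES categories in free text with search terms, then scoring the flags against manual review with a positive predictive value. It is not the authors' implementation. The search terms, the category names, and the helper functions `flag_categories` and `positive_predictive_value` are all hypothetical, since the abstract does not list the 830 terms or describe the matching rules.

```python
import re

# Hypothetical search terms for illustration only; the actual terms used in
# the paper are not given in the abstract.
SEARCH_TERMS = {
    "medicaid": ["medicaid"],
    "homelessness": ["homeless", "shelter"],
    "unemployment": ["unemployed", "out of work"],
}


def flag_categories(note, search_terms=SEARCH_TERMS):
    """Return the set of categories with at least one term found in the note.

    Terms are matched case-insensitively as whole words or phrases.
    """
    text = note.lower()
    found = set()
    for category, terms in search_terms.items():
        for term in terms:
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                found.add(category)
                break  # one hit is enough to flag the category
    return found


def positive_predictive_value(flagged, confirmed):
    """PPV = true positives / all records flagged by the algorithm.

    flagged:   set of record ids flagged by the algorithm
    confirmed: set of record ids confirmed by manual review
    Returns None when nothing was flagged, since PPV is undefined then.
    """
    if not flagged:
        return None
    return len(flagged & confirmed) / len(flagged)


if __name__ == "__main__":
    print(flag_categories("Patient is homeless and on Medicaid."))
    # Toy record ids, not data from the study.
    print(positive_predictive_value({1, 2, 3, 4}, {1, 2, 3, 9}))
```

Plain term matching of this kind also flags a note that says a patient is "not homeless", because it ignores negation and context. That makes manual review of a sample a natural check on each category's flags.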