Authors: Carlos Sáez, Oscar Zurriaga, Jordi Pérez-Panadés, Inma Melchor, Montserrat Robles
DOI: 10.1093/JAMIA/OCW010
Keywords:
Abstract: Objective: To assess the variability in data distributions among data sources and over time through a case study of a large multisite repository as a systematic approach to data quality (DQ). Materials and Methods: Novel probabilistic DQ control methods based on information theory and geometry are applied to the Public Health Mortality Registry of the Region of Valencia, Spain, with 512 143 entries from 2000 to 2012, disaggregated into 24 health departments. The methods provide DQ metrics and exploratory visualizations for (1) assessing the variability among multiple sources and (2) monitoring and exploring changes over time. The methods are suited to big data and to multitype, multivariate, and multimodal data. Results: The repository was partitioned into 2 probabilistically separated temporal subgroups following a change in the Spanish National Death Certificate in 2009. Punctual temporal anomalies were noticed due to a punctual increment in missing data, along with outlying and clustered health departments due to differences in populations or in practices. Discussion: Changes in protocols, differences in populations, biased practices, or other systematic DQ problems affected data variability. Even if semantic and integration aspects are addressed in data sharing infrastructures, probabilistic variability may still be present. Solutions include fixing or excluding data and analyzing different sites or time periods separately. A systematic approach to assessing temporal and multisite variability is proposed. Conclusion: Multisite and temporal variability in data distributions affects DQ, hindering data reuse, and an assessment of such variability should be part of systematic DQ procedures.
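
As a rough illustration of the multisource assessment described in the abstract, the sketch below compares the distribution of one categorical variable across sources and reports the source that lies farthest, on average, from the others. It assumes the Jensen-Shannon distance as the dissimilarity measure; the abstract does not name a specific measure and says only that the methods rest on information theory and geometry. The department names, categories, and counts are invented toy data, and the helper names (`distribution`, `js_distance`, `mean_distance_to_others`) are hypothetical.

```python
import math
from collections import Counter


def distribution(values, categories):
    """Relative frequency of each category (assumes at least one value)."""
    counts = Counter(values)
    total = sum(counts[c] for c in categories)
    return [counts[c] / total for c in categories]


def js_distance(p, q):
    """Jensen-Shannon distance with base-2 logs, bounded in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; zero-probability terms contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = (kl(p, m) + kl(q, m)) / 2
    return math.sqrt(max(0.0, divergence))  # guard against tiny negative rounding


def mean_distance_to_others(dists):
    """Average distance from each source to every other source."""
    scores = {}
    for name, p in dists.items():
        others = [js_distance(p, q) for other, q in dists.items() if other != name]
        scores[name] = sum(others) / len(others)
    return scores


# Invented toy data: three hypothetical departments, one coded variable.
categories = ["circulatory", "cancer", "missing"]
records = {
    "dept_A": ["circulatory"] * 50 + ["cancer"] * 40 + ["missing"] * 10,
    "dept_B": ["circulatory"] * 48 + ["cancer"] * 42 + ["missing"] * 10,
    "dept_C": ["circulatory"] * 20 + ["cancer"] * 20 + ["missing"] * 60,
}

dists = {name: distribution(vals, categories) for name, vals in records.items()}
scores = mean_distance_to_others(dists)
outlier = max(scores, key=scores.get)  # source farthest from the rest on average
print(outlier, round(scores[outlier], 3))
```

In this toy setup the department with an inflated share of missing values stands apart from the other two, which loosely mirrors the kind of outlying source the abstract reports. The same pairwise comparison could be applied to consecutive time batches instead of sources to look for temporal shifts.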