Authors: Pablo Ferri Borreda, Carlos Saez, Juan Miguel Garcia Gomez
Keywords:
Abstract: The degree of homogeneity of the statistical distributions among data sources is a critical issue when reusing Integrated Data Repositories (IDRs). Evaluating this source stability is of utmost importance in order to ensure confident reuse. This work tackles the task of discovering and classifying patterns of source stability in multiple IDRs, by means of a novel approach based on simplicial projections from probability distribution distances, combined with density-based spatial clustering of applications with noise (DBSCAN). The results, evaluated on 20 public repositories, support the existence of four main source stability patterns in biomedical repositories: the global stability pattern (GSP), the local stability pattern (LSP), the sparse stability pattern (SSP) and the instability pattern (IP).
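As a rough illustration of the idea summarised in the abstract, the sketch below groups data sources by the distance between their probability distributions and then applies a minimal DBSCAN. It is not the authors' implementation, and several things in it are assumptions: the abstract does not name the distance, so the Jensen-Shannon distance is used here; the simplicial projection step is skipped and clustering runs directly on the pairwise distance matrix; and the toy histograms, `eps` and `min_samples` are invented for the example.

```python
import math

def js_distance(p, q):
    """Jensen-Shannon distance (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with zero mass contribute nothing
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    # max() guards against tiny negative values caused by rounding
    return math.sqrt(max(0.0, (kl(p, m) + kl(q, m)) / 2))

def dbscan_precomputed(dist, eps, min_samples):
    """Minimal DBSCAN on a precomputed distance matrix; label -1 means noise."""
    n = len(dist)
    # Each point counts as its own neighbour (distance 0 <= eps)
    neighbors = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors[i] if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # expand only through core points
    return labels

# Hypothetical per-source histograms of one categorical variable (assumed data)
sources = {
    "A": [0.50, 0.30, 0.20],
    "B": [0.52, 0.28, 0.20],
    "C": [0.49, 0.31, 0.20],
    "D": [0.10, 0.20, 0.70],
    "E": [0.12, 0.18, 0.70],
    "F": [0.05, 0.90, 0.05],
}
names = list(sources)
dist = [[js_distance(sources[a], sources[b]) for b in names] for a in names]

# eps and min_samples are illustrative choices, not values from the paper
labels = dbscan_precomputed(dist, eps=0.1, min_samples=2)
print(dict(zip(names, labels)))
```

With these toy inputs, sources A, B and C fall into one cluster, D and E into a second, and F is labelled as noise. How such cluster structures map onto the four patterns (GSP, LSP, SSP, IP) is defined in the paper itself and is not reproduced here.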