Authors: Sumit Mukherjee, Thanneer Perumal, Kenneth Daily, Solveig Sieberts, Larsson Omberg
DOI: 10.1101/534305
Keywords:
Abstract: Motivation: Late onset Alzheimer's disease (LOAD) is currently a disease with no known effective treatment options. To address this, there has been a recent surge in the generation of multi-modality data (Hodes and Buckholtz, 2016; Mueller et al., 2005) to understand the biology of the disease and the potential drivers that causally regulate it. However, most analytic studies using these data-sets focus on uni-modal analysis of the data. Here we propose a data-driven approach to integrate multiple data types and analytic outcomes to aggregate evidence in support of the hypothesis that a gene is a genetic driver of the disease. The main algorithmic contributions of our paper are: i) a general machine learning framework to learn the key characteristics of a few known driver genes from multiple feature-sets and to identify other potential driver genes which have similar feature representations, and ii) a flexible ranking scheme with the ability to integrate external validation in the form of Genome Wide Association Study (GWAS) summary statistics. While we currently focus on demonstrating the effectiveness of the approach using different analytic outcomes from RNA-Seq studies, this method is easily generalizable to other data modalities and analysis types. Results: We demonstrate the utility of our machine learning algorithm on two benchmark multi-view datasets by significantly outperforming the baseline approaches in predicting missing labels. We then use the algorithm to predict and rank potential drivers of Alzheimer's. We show that our ranked genes show a significant enrichment for SNPs associated with Alzheimer's, and are enriched in pathways that have been previously associated with the disease. Availability: Source code and a link to all feature sets are available at https://github.com/Sage-Bionetworks/EvidenceAggregatedDriverRanking. Contact: ben.logsdon@sagebionetworks.org