Authors: Richard T. Guy, Peter Santago, Carl D. Langefeld
DOI: 10.1002/GEPI.21608
Keywords:
Abstract: Complex genetic disorders are a result of a combination of genetic and nongenetic factors, all potentially interacting. Machine learning methods hold the potential to identify the multilocus and environmental associations thought to drive complex genetic traits. Decision trees, a popular machine learning technique, offer a computationally low-complexity algorithm capable of detecting associated sets of single nucleotide polymorphisms (SNPs) of arbitrary size, including in modern genome-wide SNP scans. However, interpretation of the importance of an individual SNP within these trees can present challenges. We present a new decision tree algorithm, denoted as Bagged Alternating Decision Trees (BADTrees), that is based on identifying common structural elements in a bootstrapped set of Alternating Decision Trees (ADTrees). The algorithm is of order nk^2, where n is the number of SNPs considered and k is the number of SNPs in the tree constructed. Our simulation study suggests that BADTrees have higher power and lower type I error rates than ADTrees alone, and comparable power with lower type I error rates compared to logistic regression. We illustrate the application of BADTrees using simulated data as well as data from the Lupus Large Association Study 1 (7,822 SNPs in 3,548 individuals). Our results suggest that BADTrees hold promise as a low computational order algorithm for detecting complex combinations of SNP and environmental factors associated with disease.