Secure large-scale genome-wide association studies using homomorphic encryption

Authors: Marcelo Blatt, Alexander Gusev, Yuriy Polyakov, Shafi Goldwasser

DOI: 10.1073/PNAS.1918257117

Keywords: Genome-wide association study; Node (computer science); Performance results; Computer science; Data mining; Scale (descriptive set theory); Homomorphic encryption; Secure multi-party computation; Encryption; Single server

Abstract: Genome-wide association studies (GWASs) seek to identify genetic variants associated with a trait, and have been a powerful approach for understanding complex diseases. A critical challenge for GWASs has been the dependence on individual-level data that typically have strict privacy requirements, creating an urgent need for methods that preserve the privacy of participants. Here, we present a privacy-preserving framework based on several advances in homomorphic encryption and demonstrate that it can perform an accurate GWAS analysis for a real dataset of more than 25,000 individuals, keeping all individual data encrypted and requiring no user interactions. Our extrapolations show that it can evaluate GWASs of 100,000 individuals and 500,000 single-nucleotide polymorphisms (SNPs) in 5.6 h on a single server node (or in 11 min on 31 server nodes running in parallel). Our performance results are one order of magnitude faster than prior state-of-the-art results using secure multiparty computation, which requires continuous user interactions, with the accuracy of both solutions being similar. The homomorphic encryption advances can also be applied to other domains where large-scale statistical analyses over encrypted data are needed.
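The two runtime figures in the abstract (5.6 h on one node, 11 min on 31 nodes) are consistent with a workload that splits almost evenly across nodes. The following is a minimal arithmetic check, not the authors' extrapolation method: it assumes ideal linear scaling with no coordination overhead, and the function name `ideal_parallel_minutes` is hypothetical.

```python
# Figures quoted in the abstract.
SINGLE_NODE_HOURS = 5.6
NODES = 31


def ideal_parallel_minutes(single_node_hours: float, nodes: int) -> float:
    """Runtime in minutes if the work divides evenly across nodes.

    Assumption: perfect linear scaling, i.e. no communication or
    scheduling overhead between nodes.
    """
    return single_node_hours * 60.0 / nodes


minutes = ideal_parallel_minutes(SINGLE_NODE_HOURS, NODES)
print(f"{minutes:.1f} min")  # prints "10.8 min"
```

Under that assumption the result rounds to the 11 min quoted in the abstract, which suggests the per-SNP computations are largely independent of one another.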

References (24)
Gilad Asharov, Abhishek Jain, Adriana López-Alt, Eran Tromer, Vinod Vaikuntanathan, Daniel Wichs. Multiparty Computation with Low Communication, Computation and Interaction via Threshold FHE. Advances in Cryptology – EUROCRYPT 2012, pp. 483-501 (2012). DOI: 10.1007/978-3-642-29011-4_29
Joshua C Denny, Lisa Bastarache, Marylyn D Ritchie, Robert J Carroll, Raquel Zink, Jonathan D Mosley, Julie R Field, Jill M Pulley, Andrea H Ramirez, Erica Bowton, Melissa A Basford, David S Carrell, Peggy L Peissig, Abel N Kho, Jennifer A Pacheco, Luke V Rasmussen, David R Crosslin, Paul K Crane, Jyotishman Pathak, Suzette J Bielinski, Sarah A Pendergrass, Hua Xu, Lucia A Hindorff, Rongling Li, Teri A Manolio, Christopher G Chute, Rex L Chisholm, Eric B Larson, Gail P Jarvik, Murray H Brilliant, Catherine A McCarty, Iftikhar J Kullo, Jonathan L Haines, Dana C Crawford, Daniel R Masys, Dan M Roden. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nature Biotechnology, vol. 31, pp. 1102-1110 (2013). DOI: 10.1038/NBT.2749
Krina T Zondervan, Lon R Cardon. The complex interplay among factors that influence allelic association. Nature Reviews Genetics, vol. 5, pp. 89-100 (2004). DOI: 10.1038/NRG1270
Craig Gentry. Fully homomorphic encryption using ideal lattices. Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC '09), pp. 169-178 (2009). DOI: 10.1145/1536414.1536440
Steven E. Brenner. Be prepared for the big genome leak. Nature, vol. 498, p. 139 (2013). DOI: 10.1038/498139A
Peter D. Sasieni. From genotypes to genes: doubling the sample size. Biometrics, vol. 53, pp. 1253-1261 (1997). DOI: 10.2307/2533494
Chia-Yen Chen, Samuela Pollack, David J. Hunter, Joel N. Hirschhorn, Peter Kraft, Alkes L. Price. Improved ancestry inference using weights from external reference panels. Bioinformatics, vol. 29, pp. 1399-1406 (2013). DOI: 10.1093/BIOINFORMATICS/BTT144
Karolina Sikorska, Emmanuel Lesaffre, Patrick FJ Groenen, Paul HC Eilers. GWAS on your notebook: Fast semi-parallel linear and logistic regression for genome-wide association studies. BMC Bioinformatics, vol. 14, p. 166 (2013). DOI: 10.1186/1471-2105-14-166
Zvika Brakerski, Craig Gentry, Vinod Vaikuntanathan. (Leveled) fully homomorphic encryption without bootstrapping. Innovations in Theoretical Computer Science Conference, pp. 309-325 (2012). DOI: 10.1145/2090236.2090262
M. Gymrek, A. L. McGuire, D. Golan, E. Halperin, Y. Erlich. Identifying personal genomes by surname inference. Science, vol. 339, pp. 321-324 (2013). DOI: 10.1126/SCIENCE.1229566