A Dynamic Data Warehousing Platform for Creating and Accessing Biomedical Data Lakes

Authors: Pradeeban Kathiravelu, Ashish Sharma

DOI: 10.1007/978-3-319-57741-8_7

Abstract: Medical research use cases are population centric, unlike the clinical use cases, which are patient or individual centric. Hence, research use cases require accessing medical archives and data source repositories of a heterogeneous nature. Traditionally, in order to query data from these sources, users manually access and download parts or the whole of the data sources. The existing solutions tend to focus on a specific data format or storage, which prevents using them for a more generic research scenario with heterogeneous data sources, where the user may not have knowledge of the data schema a priori. In this paper, we propose and discuss the design, implementation, and evaluation of Data Cafe, a scalable distributed architecture that aims to address the shortcomings of the existing approaches. Data Cafe lets resource providers create biomedical data lakes from various data sources, and lets research users consume the data lakes efficiently and quickly without having a priori knowledge of the data schema.
