Xbase

Authors: Wen-Syan Li, Jianfeng Yan, Ying Yan, Jin Zhang

DOI: 10.1145/1739041.1739125

Keywords: xBase, XML, Computer science, Information appliance, Cloud computing, Health care, Systems architecture, Search engine indexing, Database, Relational database management system

Abstract: XML is a more desirable format for modeling and storing clinical data in EMR (electronic medical record) applications because of its extensibility; however, existing systems are built on top of either an RDBMS or file systems and lack support for complex, large-scale healthcare applications, such as treatment effectiveness analysis and procedure optimization. SAP Technology Lab, China is developing a cloud-enabled information appliance, Xbase, on top of Hadoop, which is the first XML-based information appliance designed specifically for healthcare applications. Xbase presents a different set of challenges in query processing, indexing, parallelism, and distributed computing using Hadoop's APIs as well as the HDFS storage infrastructure and the MapReduce framework. In this paper, we describe the system architecture and internal designs of Xbase and how query processing and indexing are mapped to Hadoop. We also discuss why we select Hadoop over other candidates, such as HBase, Google's Bigtable, and Hive.
