GBI: A Generalized R-Tree Bulk-Insertion Strategy

作者: Rupesh Choubey , Li Chen , Elke A. Rundensteiner

DOI: 10.1007/3-540-48482-5_8

关键词: Tree (data structure)Spatial databaseR-treeOutlierData setComputer scienceR+ treeCluster analysisSearch engine indexingData mining

摘要: A lot of recent work has studied strategies related to bulk loading large data sets into multidimensional index structures. In this paper, we address the problem insertions existing structures with particular focus on R-trees - which are an important class used widely in commercial database systems. We propose a new technique, as opposed current technique inserting one by one, inserts entire incoming datasets active R-tree. This called GBI (for Generalized Bulk Insertion), partitions clusters and outliers, constructs R-tree (small tree) from each cluster, identifies prepares suitable locations original (large for insertion, lastly performs small trees outliers tree bulk. Our experimental studies demonstrate that does especially well (over 200% better than technique) randomly located real contain few natural clusters, while also consistently outperforming alternate all other circumstances.

参考文章(23)
Nick Roussopoulos, Yannis Kotidis, Mema Roussopoulos, Cubetree: organization of and bulk incremental updates on the data cube international conference on management of data. ,vol. 26, pp. 89- 99 ,(1997) , 10.1145/253260.253276
Joe H. Ward, Hierarchical Grouping to Optimize an Objective Function Journal of the American Statistical Association. ,vol. 58, pp. 236- 244 ,(1963) , 10.1080/01621459.1963.10500845
Jianzhong Li, Doron Rotem, Jaideep Srivastava, Algorithms for loading parallel grid files Proceedings of the 1993 ACM SIGMOD international conference on Management of data - SIGMOD '93. ,vol. 22, pp. 347- 356 ,(1993) , 10.1145/170035.170086
Yván J. García R, Mario A. López, Scott T. Leutenegger, A greedy algorithm for bulk loading R-trees advances in geographic information systems. pp. 163- 164 ,(1998) , 10.1145/288692.288723
Li Chen, Rupesh Choubey, Elke A. Rundensteiner, Bulk-insertions into R-trees using the small-tree-large-tree approach advances in geographic information systems. pp. 161- 162 ,(1998) , 10.1145/288692.288722
Ibrahim Kamel, Christos Faloutsos, On packing R-trees Proceedings of the second international conference on Information and knowledge management - CIKM '93. pp. 490- 499 ,(1993) , 10.1145/170088.170403
Weidong Chen, Programming with logical queries, bulk updates, and hypothetical reasoning IEEE Transactions on Knowledge and Data Engineering. ,vol. 9, pp. 587- 599 ,(1997) , 10.1109/69.617052
Antonin Guttman, R-trees Proceedings of the 1984 ACM SIGMOD international conference on Management of data - SIGMOD '84. ,vol. 14, pp. 47- 57 ,(1984) , 10.1145/602259.602266
S.T. Leutenegger, D.M. Nicol, Efficient bulk-loading of gridfiles IEEE Transactions on Knowledge and Data Engineering. ,vol. 9, pp. 410- 420 ,(1997) , 10.1109/69.599930
S.T. Leutenegger, M.A. Lopez, J. Edgington, STR: a simple and efficient algorithm for R-tree packing international conference on data engineering. pp. 497- 506 ,(1997) , 10.1109/ICDE.1997.582015