Authors: Rupesh Choubey, Li Chen, Elke A. Rundensteiner
Keywords: Tree (data structure), Spatial database, R-tree, Outlier, Data set, Computer science, R+ tree, Cluster analysis, Search engine indexing, Data mining
Abstract: A lot of recent work has studied strategies related to bulk loading of large data sets into multidimensional index structures. In this paper, we address the problem of bulk insertions into existing index structures, with particular focus on R-trees, which are an important class of index structures used widely in commercial database systems. We propose a new technique that, as opposed to the current technique of inserting data one by one, bulk-inserts entire incoming datasets into an active R-tree. This technique, called GBI (for Generalized Bulk Insertion), partitions the new datasets into clusters and outliers, constructs an R-tree (small tree) from each cluster, identifies and prepares suitable locations in the original R-tree (large tree) for insertion, and lastly performs the insertion of the small trees and the outliers into the large tree in bulk. Our experimental studies demonstrate that GBI does especially well (over 200% better than the existing technique) for randomly located data as well as for real datasets that contain few natural clusters, while also consistently outperforming the alternate technique in all other circumstances.
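
The first two steps named in the abstract (splitting an incoming dataset into clusters and outliers, then grouping each cluster under one bounding rectangle) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say how clusters are detected, so the grid-density heuristic, the function names `partition`, `mbr`, and `plan_bulk_insert`, and the parameters `cell` and `min_pts` are all assumptions made for illustration. Each "small tree" is reduced here to a bounding rectangle plus its entries rather than a real R-tree.

```python
from collections import defaultdict


def partition(points, cell=1.0, min_pts=3):
    """Split 2-D points into clusters and outliers.

    Hypothetical heuristic: points falling in the same grid cell form a
    cluster if the cell holds at least `min_pts` points; points in sparser
    cells are treated as outliers.
    """
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell), int(y // cell))].append((x, y))
    clusters = [pts for pts in cells.values() if len(pts) >= min_pts]
    outliers = [p for pts in cells.values() if len(pts) < min_pts for p in pts]
    return clusters, outliers


def mbr(pts):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a point list."""
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))


def plan_bulk_insert(points, cell=1.0, min_pts=3):
    """Return (small_trees, outliers) for an incoming dataset.

    Each small tree is a stand-in dict holding the cluster's bounding
    rectangle and entries; a full implementation would build an R-tree
    per cluster and then locate an insertion point in the large tree.
    """
    clusters, outliers = partition(points, cell, min_pts)
    small_trees = [{"mbr": mbr(c), "entries": c} for c in clusters]
    return small_trees, outliers


if __name__ == "__main__":
    incoming = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (5.5, 5.5), (9.0, 1.0)]
    trees, loose = plan_bulk_insert(incoming)
    print(trees)
    print(loose)
```

In this sketch the three nearby points end up in one small tree, while the two isolated points are returned as outliers to be inserted individually. Locating and preparing the insertion position in the large tree, the remaining steps the abstract describes, is left out.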