作者: Kejing Lu , Mineichi Kudo
DOI: 10.1109/ICDE48307.2020.00095
关键词:
摘要: Locality sensitive hashing (LSH) is a widely practiced c-approximate nearest neighbor (c-ANN) search algorithm because of its appealing theoretical guarantee and empirical performance. However, available LSH-based solutions do not achieve good balance between cost quality certain limitations in their index structures.In this paper, we propose novel easy-to-implement disk- based method named R2LSH to answer ANN queries highdimensional spaces. In the indexing phase, maps data objects into multiple two-dimensional projected each space, group B+-trees constructed characterize corresponding distribution. query by setting query-centric ball space using dynamic counting technique, efficiently determines candidates returns results with required quality. Rigorous analysis reveals that proposed supports c-ANN for arbitrarily small c ≥ 1 probability guarantee. Extensive experiments on real datasets verify superiority over state-of-the-art methods.