Preserving-Ignoring Transformation Based Index for Approximate k Nearest Neighbor Search

作者: Gang Hu , Jie Shao , Dongxiang Zhang , Yang Yang , Heng Tao Shen

DOI: 10.1109/ICDE.2017.47

关键词:

摘要: Locality sensitive hashing (LSH) and its variants are widely used for approximate kNN (k nearest neighbor) search in high-dimensional space. The success of these techniques largely depends on the ability preserving information. Unfortunately, LSH only provides a high probability that nearby points original space projected into region new This potentially makes many false positives negatives resulting from unrelated points. Many extensions aim to alleviate above issue by improving distance ability. In this paper, we abound function but propose novel idea enhance performance transforming data before applying LSH. A preserving-ignoring transformation (PIT) satisfying some rigorous conditions can be convert an interim with strict capacity. Based property, linear order is utilized build efficient index structure Finally, applied candidate set searched our final results. Experiments conducted proposed approach performs better than state-of-the-art methods SK-LSH, DSH NSH terms both accuracy efficiency.

参考文章(13)
Yingfan Liu, Jiangtao Cui, Zi Huang, Hui Li, Heng Tao Shen, SK-LSH Proceedings of the VLDB Endowment. ,vol. 7, pp. 745- 756 ,(2014) , 10.14778/2732939.2732947
Jinyang Gao, Hosagrahar Visvesvaraya Jagadish, Wei Lu, Beng Chin Ooi, None, DSH: data sensitive hashing for high-dimensional k-nnsearch international conference on management of data. pp. 1127- 1138 ,(2014) , 10.1145/2588555.2588565
Alexis Joly, Olivier Buisson, A posteriori multi-probe locality sensitive hashing Proceeding of the 16th ACM international conference on Multimedia - MM '08. pp. 209- 218 ,(2008) , 10.1145/1459359.1459388
Junhao Gan, Jianlin Feng, Qiong Fang, Wilfred Ng, Locality-sensitive hashing scheme based on dynamic collision counting Proceedings of the 2012 international conference on Management of Data - SIGMOD '12. pp. 541- 552 ,(2012) , 10.1145/2213836.2213898
Piotr Indyk, Rajeev Motwani, Approximate nearest neighbors: towards removing the curse of dimensionality symposium on the theory of computing. pp. 604- 613 ,(1998) , 10.1145/276698.276876
Zhongming Jin, Yao Hu, Yue Lin, Debing Zhang, Shiding Lin, Deng Cai, Xuelong Li, Complementary Projection Hashing international conference on computer vision. pp. 257- 264 ,(2013) , 10.1109/ICCV.2013.39
Yufei Tao, Ke Yi, Cheng Sheng, Panos Kalnis, Quality and efficiency in high dimensional nearest neighbor search Proceedings of the 35th SIGMOD international conference on Management of data - SIGMOD '09. pp. 563- 576 ,(2009) , 10.1145/1559845.1559905
Moses Charikar, William Josephson, Qin Lv, Kai Li, Zhe Wang, Multi-probe LSH: efficient indexing for high-dimensional similarity search very large data bases. pp. 950- 961 ,(2007)
Shih-fu Chang, Sanjiv Kumar, Wei Liu, Jun Wang, Hashing with Graphs international conference on machine learning. pp. 1- 8 ,(2011)
Jae-Pil Heo, Youngwoon Lee, Junfeng He, Shih-Fu Chang, Sung-Eui Yoon, None, Spherical hashing computer vision and pattern recognition. pp. 2957- 2964 ,(2012)