Neighbor-sensitive hashing

作者： Yongjoo Park , Michael Cafarella , Barzan Mozafari

关键词: Computer science 、 Data mining 、 Process (computing) 、 Space (commercial competition) 、 Machine learning 、 Java hashCode 、 Locality-sensitive hashing 、 Hash function 、 Artificial intelligence 、 Identification (information)

摘要: Approximate kNN (k-nearest neighbor) techniques using binary hash functions are among the most commonly used approaches for overcoming prohibitive cost of performing exact queries. However, success these largely depends on their functions' ability to distinguish items; that is, items retrieved based data items' hashcodes, should include as many true possible. A widely-adopted principle this process is ensure similar assigned same hashcode so with hashcodes a query's likely be neighbors.In work, we abandon heavily-utilized and pursue opposite direction generating more effective tasks. That aim increase distance between in space, instead reducing it. Our contribution begins by providing theoretical analysis why revolutionary seemingly counter-intuitive approach leads accurate identification items. followed proposal hashing algorithm embeds novel principle. empirical studies confirm idea significantly improves efficiency accuracy state-of-the-art techniques.

参考文章(57)

Sameer Agarwal, Barzan Mozafari, Aurojit Panda, Henry Milner, Samuel Madden, Ion Stoica, BlinkDB Proceedings of the 8th ACM European Conference on Computer Systems - EuroSys '13. pp. 29- 42 ,(2013) , 10.1145/2465351.2465355

David Arthur, Sergei Vassilvitskii, k-means++: the advantages of careful seeding symposium on discrete algorithms. pp. 1027- 1035 ,(2007) , 10.5555/1283383.1283494

Yunchao Gong, Svetlana Lazebnik, Iterative quantization: A procrustean approach to learning binary codes computer vision and pattern recognition. pp. 817- 824 ,(2011) , 10.1109/CVPR.2011.5995432

Bin Cui, Beng Chin Coi, Jianwen Su, K.-L. Tan, Indexing high-dimensional data for efficient in-memory similarity search IEEE Transactions on Knowledge and Data Engineering. ,vol. 17, pp. 339- 353 ,(2005) , 10.1109/TKDE.2005.46

Kai Zeng, Shi Gao, Barzan Mozafari, Carlo Zaniolo, The analytical bootstrap: a new method for fast error estimation in approximate query processing international conference on management of data. pp. 277- 288 ,(2014) , 10.1145/2588555.2588579

Antonin Guttman, R-trees Proceedings of the 1984 ACM SIGMOD international conference on Management of data - SIGMOD '84. ,vol. 14, pp. 47- 57 ,(1984) , 10.1145/602259.602266

M. Norouzi, A. Punjani, D. J. Fleet, Fast search in Hamming space with multi-index hashing computer vision and pattern recognition. pp. 3108- 3115 ,(2012) , 10.1109/CVPR.2012.6248043

Shrikant Kashyap, Panagiotis Karras, Scalable kNN search on vertically stored time series Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '11. pp. 1334- 1342 ,(2011) , 10.1145/2020408.2020607

H Jégou, M Douze, C Schmid, Product Quantization for Nearest Neighbor Search IEEE Transactions on Pattern Analysis and Machine Intelligence. ,vol. 33, pp. 117- 128 ,(2011) , 10.1109/TPAMI.2010.57

10.

Narayanan Sundaram, Aizana Turmukhametova, Nadathur Satish, Todd Mostak, Piotr Indyk, Samuel Madden, Pradeep Dubey, Streaming similarity search over one billion tweets using parallel locality-sensitive hashing Proceedings of the VLDB Endowment. ,vol. 6, pp. 1930- 1941 ,(2013) , 10.14778/2556549.2556574

Neighbor-sensitive hashing

来源期刊

我的账户

Neighbor-sensitive hashing

来源期刊

相似文章 10

我的账户