作者: Yongjoo Park , Michael Cafarella , Barzan Mozafari
关键词: Computer science 、 Data mining 、 Process (computing) 、 Space (commercial competition) 、 Machine learning 、 Java hashCode 、 Locality-sensitive hashing 、 Hash function 、 Artificial intelligence 、 Identification (information)
摘要: Approximate kNN (k-nearest neighbor) techniques using binary hash functions are among the most commonly used approaches for overcoming prohibitive cost of performing exact queries. However, success these largely depends on their functions' ability to distinguish items; that is, items retrieved based data items' hashcodes, should include as many true possible. A widely-adopted principle this process is ensure similar assigned same hashcode so with hashcodes a query's likely be neighbors.In work, we abandon heavily-utilized and pursue opposite direction generating more effective tasks. That aim increase distance between in space, instead reducing it. Our contribution begins by providing theoretical analysis why revolutionary seemingly counter-intuitive approach leads accurate identification items. followed proposal hashing algorithm embeds novel principle. empirical studies confirm idea significantly improves efficiency accuracy state-of-the-art techniques.