The Gaussian Bloom Filter

作者: Martin Werner , Mirco Schönfeld

DOI: 10.1007/978-3-319-18120-2_12

关键词:

摘要: Modern databases tailored to highly distributed, fault tolerant management of information for big data applications exploit a classical structure reducing disk and network I/O as well managing distribution: The Bloom filter. This allows encode small sets elements, typically the keys in key-value store, into small, constant-size structure. In order reduce memory consumption, this suffers from false positives which lead additional operations are therefore only harmful with respect performance. With paper, we propose an extension filter construction facilitates use floating point coprocessors GPUs or main positives. proposed is compatible sense that can be extracted time linear size special case our construction. We show approach provides relevant gain positive rate. Implementations Apache Cassandra, C++, NVIDIA CUDA given support feasibility results approach.

参考文章(10)
Mirco Schönfeld, Martin Werner, Node Wake-Up via OVSF-Coded Bloom Filters in Wireless Sensor Networks ad hoc networks. pp. 119- 134 ,(2013) , 10.1007/978-3-319-04105-6_8
Michael Mitzenmacher, John Byers, Jeffrey Considine, Fast Approximate Reconciliation of Set Differences Boston University Computer Science Department. ,(2002)
P. Jones, rd D. Eastlake, US Secure Hash Algorithm 1 (SHA1) RFC. ,vol. 3174, pp. 1- 22 ,(2001)
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber, Bigtable ACM Transactions on Computer Systems. ,vol. 26, pp. 1- 26 ,(2008) , 10.1145/1365815.1365816
Andrei Broder, Michael Mitzenmacher, Network Applications of Bloom Filters: A Survey Internet Mathematics. ,vol. 1, pp. 485- 509 ,(2004) , 10.1080/15427951.2004.10129096
A. Whitaker, D. Wetherall, Forwarding without loops in Icarus 2002 IEEE Open Architectures and Network Programming Proceedings. OPENARCH 2002 (Cat. No.02EX571). pp. 63- 75 ,(2002) , 10.1109/OPNARC.2002.1019229
Wu-Chang Feng, D.D. Kandlur, D. Saha, K.G. Shin, Stochastic fair blue: a queue management algorithm for enforcing fairness international conference on computer communications. ,vol. 3, pp. 1520- 1529 ,(2001) , 10.1109/INFCOM.2001.916648
Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler, The Hadoop Distributed File System ieee conference on mass storage systems and technologies. pp. 1- 10 ,(2010) , 10.1109/MSST.2010.5496972
Burton H. Bloom, Space/time trade-offs in hash coding with allowable errors Communications of the ACM. ,vol. 13, pp. 422- 426 ,(1970) , 10.1145/362686.362692
Nicol N. Schraudolph, A Fast, Compact Approximation of the Exponential Function Neural Computation. ,vol. 11, pp. 853- 862 ,(1999) , 10.1162/089976699300016467