A hybrid technique for estimating frequency moments over data streams

作者: Sumit Ganguly

DOI:

关键词:

摘要: The problem of estimating the k frequency moment Fk, for any non-negative integral value k, over a data stream by looking at items exactly once as they arrive, was considered in seminal paper Alon, Matias and Szegedy [1, 2]. They present sampling based algorithm to estimate Fk where, ≥ 2, using space O(n1−1/k)). Coppersmith Kumar [7] [10], different methods, algorithms with complexity O(n1−1/(k−1)). In this paper, we an O(n1−2/(k+1)), > thereby, improving compared 7, 10] 4.

参考文章(11)
Moses Charikar, Kevin Chen, Martin Farach-Colton, Finding Frequent Items in Data Streams international colloquium on automata languages and programming. ,vol. 312, pp. 693- 703 ,(2002) , 10.1016/S0304-3975(03)00400-6
Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar, D. Sivakumar, Luca Trevisan, Counting Distinct Elements in a Data Stream randomization and approximation techniques in computer science. pp. 1- 10 ,(2002) , 10.1007/3-540-45726-7_1
Don Coppersmith, Ravi Kumar, An improved data stream algorithm for frequency moments symposium on discrete algorithms. pp. 151- 156 ,(2004) , 10.5555/982792.982815
Philippe Flajolet, G. Nigel Martin, Probabilistic counting algorithms for data base applications Journal of Computer and System Sciences. ,vol. 31, pp. 182- 209 ,(1985) , 10.1016/0022-0000(85)90041-8
Anna C. Gilbert, Sudipto Guha, Piotr Indyk, Yannis Kotidis, S. Muthukrishnan, Martin J. Strauss, Fast, small-space algorithms for approximate histogram maintenance symposium on the theory of computing. pp. 389- 398 ,(2002) , 10.1145/509907.509966
Michael Saks, Xiaodong Sun, Space lower bounds for distance approximation in the data stream model symposium on the theory of computing. pp. 360- 369 ,(2002) , 10.1145/509907.509963
Noga Alon, Yossi Matias, Mario Szegedy, The Space Complexity of Approximating the Frequency Moments Journal of Computer and System Sciences. ,vol. 58, pp. 137- 147 ,(1999) , 10.1006/JCSS.1997.1545
Ziv Bar-Yossef, T.S. Jayram, Ravi Kumar, D. Sivakumar, An information statistics approach to data stream and communication complexity foundations of computer science. ,vol. 68, pp. 209- 218 ,(2002) , 10.1016/J.JCSS.2003.11.006
Sumit Ganguly, Minos Garofalakis, Rajeev Rastogi, Processing set expressions over continuous update streams international conference on management of data. pp. 265- 276 ,(2003) , 10.1145/872757.872790
A. Chakrabarti, S. Khot, Xiaodong Sun, Near-optimal lower bounds on the multi-party communication complexity of set disjointness conference on computational complexity. pp. 107- 117 ,(2003) , 10.1109/CCC.2003.1214414