Streaming sums in sublinear space.

作者: Vladimir Braverman , Stephen R. Chestnut

DOI:

关键词: Domain (mathematical analysis)MathematicsCombinatoricsGeometric meanMonotonic functionDiscrete mathematicsSublinear functionFunction (mathematics)Space (mathematics)Moment (mathematics)Harmonic mean

摘要: Given a stream $S$ of length $m$ with element frequencies $f_i$, $i\in[n]$, and function ${g:\mathbb{N}\to\mathbb{R}}$, we consider the problem computing $(1\pm\epsilon)$-approximation to $\sum g(f_i)$. Most previous efforts have focused on specific functions, for example $p$-th moment $g(x)=|x|^p$ or other norms. The main contributions this paper are complete classification space necessary approximating periodic decreasing up polylogarithmic factors, sublinear algorithm non-monotone functions satisfying relatively simple sufficient condition. This is first streaming complexity characterization that not just monotonically increasing, it improvement general frequency sums since 2010. Our sharpest results characterize nonincreasing in terms domain size \emph{and} length. In contrast approximations moments, turns out storage required approximate depends delicately stream. We apply our derive sublinear-space harmonic mean, geometric cumulative histogram frequencies.

参考文章(20)
Yi Li, David P. Woodruff, A Tight Lower Bound for High Frequency Moment Estimation with Small Error Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques. pp. 623- 638 ,(2013) , 10.1007/978-3-642-40328-6_43
Moses Charikar, Kevin Chen, Martin Farach-Colton, Finding Frequent Items in Data Streams international colloquium on automata languages and programming. ,vol. 312, pp. 693- 703 ,(2002) , 10.1016/S0304-3975(03)00400-6
Vladimir Braverman, Rafail Ostrovsky, Recursive Sketching For Frequency Moments arXiv: Data Structures and Algorithms. ,(2010)
Sumit Ganguly, Polynomial Estimators for High Frequency Moments arXiv: Data Structures and Algorithms. ,(2011)
Amit Chakrabarti, Khanh Do Ba, S. Muthukrishnan, Estimating entropy and entropy norm on data streams symposium on theoretical aspects of computer science. ,vol. 3, pp. 196- 205 ,(2006) , 10.1007/11672142_15
Sudipto Guha, Piotr Indyk, Andrew McGregor, Sketching information divergences conference on learning theory. pp. 424- 438 ,(2007) , 10.1007/978-3-540-72927-3_31
Alexandr Andoni, Robert Krauthgamer, Krzysztof Onak, Streaming Algorithms via Precision Sampling 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science. pp. 363- 372 ,(2011) , 10.1109/FOCS.2011.82