作者: Sumit Ganguly , Graham Cormode
DOI: 10.1007/978-3-540-74208-1_35
关键词: Data stream mining 、 Multivariate random variable 、 Product (mathematics) 、 Distribution (mathematics) 、 Estimator 、 Space (mathematics) 、 Mathematics 、 Data matrix (multivariate statistics) 、 Combinatorics 、 Structure (category theory)
摘要: Space-economical estimation of the pth frequency moments, defined as , for p> 0, are interest in estimating all-pairs distances a large data matrix [14], machine learning, and stream computation. Random sketches formed by inner product vector f 1 ..., n with suitably chosen random were pioneered Alon, Matias Szegedy [1], have since played central role F p computations general. The concept p-stable whose components drawn from distribution, was proposed Indyk 0 < p< 2, has been further studied Li [13]. In this paper, we consider problem sketch [7] structure [5]. Our algorithms require space $\tilde{O}(\frac{1}{\epsilon^{2+p}})$ to estimate within ±i¾?factors requires expected time $O(\log F_1 \log \frac{1}{\delta})$ process each update. Thus, our technique trades an $O(\frac{1}{\epsilon^p})$ factor much more efficient processing updates. We also present stand-alone iterative estimator .