Dynamic Maintenance of Wavelet-Based Histograms

作者: Jeffrey Scott Vitter , Yossi Matias , Min Wang

DOI:

关键词: Pattern recognitionStationary wavelet transformComputer scienceFast wavelet transformDiscrete wavelet transformWavelet transformWavelet packet decompositionWaveletContinuous wavelet transformHarmonic wavelet transformArtificial intelligenceLifting schemeSecond-generation wavelet transformHistogram

摘要: In this paper, we introduce an e cient method for the dynamic maintenance of wavelet-based histograms (and other transform-based histograms). Previous work has shown that provide more accurate selectivity estimation than traditional histograms, such as equi-depth histograms. But since are built by a nontrivial mathematical procedure, namely, wavelet transform decomposition, it is hard to maintain accuracy histogram when underlying data distribution changes over time. particular, simple techniques, split and merge, which works well updating xed set coe cients, not suitable here. We propose novel approach based upon probabilistic counting sampling waveletbased with very little online time space costs. The our robust changing distributions, get considerable improvement previous methods A nice feature can be extended naturally multidimensional while less prohibitively expensive build maintain.

参考文章(18)
D. Donjerkovic, Y. Ioannidis, R. Ramakrishnan, Dynamic Histograms: Capturing Evolving Data Sets international conference on data engineering. pp. 86- ,(2000) , 10.1109/ICDE.2000.839394
Jeffrey Scott Vitter, Min Wang, Approximate computation of multidimensional aggregates of sparse data using wavelets ACM SIGMOD Record. ,vol. 28, pp. 193- 204 ,(1999) , 10.1145/304181.304199
Yossi Matias, Jeffrey Scott Vitter, Min Wang, Wavelet-based histograms for selectivity estimation Proceedings of the 1998 ACM SIGMOD international conference on Management of data - SIGMOD '98. ,vol. 27, pp. 448- 459 ,(1998) , 10.1145/276304.276344
Philippe Flajolet, G. Nigel Martin, Probabilistic counting algorithms for data base applications Journal of Computer and System Sciences. ,vol. 31, pp. 182- 209 ,(1985) , 10.1016/0022-0000(85)90041-8
Gregory Piatetsky-Shapiro, Charles Connell, Accurate estimation of the number of tuples satisfying a condition Proceedings of the 1984 ACM SIGMOD international conference on Management of data - SIGMOD '84. ,vol. 14, pp. 256- 276 ,(1984) , 10.1145/602259.602294
Gurmeet Singh Manku, Sridhar Rajagopalan, Bruce G. Lindsay, Approximate medians and other quantiles in one pass and with limited memory Proceedings of the 1998 ACM SIGMOD international conference on Management of data - SIGMOD '98. ,vol. 27, pp. 426- 435 ,(1998) , 10.1145/276304.276342
Björn Jawerth, Wim Sweldens, An Overview of Wavelet Based Multiresolution Analyses SIAM Review. ,vol. 36, pp. 377- 412 ,(1994) , 10.1137/1036095
Jeffrey Scott Vitter, Min Wang, Bala Iyer, Data cube approximation and histograms via wavelets conference on information and knowledge management. pp. 96- 104 ,(1998) , 10.1145/288627.288645
M. Muralikrishna, David J. DeWitt, Equi-depth multidimensional histograms international conference on management of data. ,vol. 17, pp. 28- 36 ,(1988) , 10.1145/971701.50205
Ju-Hong Lee, Deok-Hwan Kim, Chin-Wan Chung, Multi-dimensional selectivity estimation using compressed histogram information ACM SIGMOD Record. ,vol. 28, pp. 205- 214 ,(1999) , 10.1145/304181.304200