Joins on Samples: A Theoretical Guide for Practitioners

Authors: Seth Pettie, Barzan Mozafari, Dong Young Yoon, Dawei Huang

DOI:

Keywords:

Abstract: Despite decades of research on approximate query processing (AQP), our understanding of sample-based joins has remained limited and, to some extent, even superficial. The …

References (56)
Arun Swami, K. Bernhard Schiefer. On the estimation of join result sizes. Extending Database Technology, pp. 287-300 (1994). DOI: 10.1007/3-540-57818-8_58
Phillip B. Gibbons, Viswanath Poosala, Swarup Acharya. Aqua: A Fast Decision Support System Using Approximate Query Answers. Very Large Data Bases, pp. 754-757 (1999).
Surajit Chaudhuri, Rajeev Motwani, Vivek Narasayya. On random sampling over joins. ACM SIGMOD Record, vol. 28, pp. 263-274 (1999). DOI: 10.1145/304181.304206
D. G. Horvitz, D. J. Thompson. A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, vol. 47, pp. 663-685 (1952). DOI: 10.2307/2280784
Sameer Agarwal, Henry Milner, Ariel Kleiner, Ameet Talwalkar, Michael Jordan, Samuel Madden, Barzan Mozafari, Ion Stoica. Knowing when you're wrong: building fast and reliable approximate query processing systems. International Conference on Management of Data, pp. 481-492 (2014). DOI: 10.1145/2588555.2593667
Rasmus Pagh, Morten Stöckel, David P. Woodruff. Is min-wise hashing optimal for summarizing set intersection? Symposium on Principles of Database Systems, pp. 109-120 (2014). DOI: 10.1145/2594538.2594554
Peter J. Haas, Joseph M. Hellerstein. Ripple joins for online aggregation. ACM SIGMOD Record, vol. 28, pp. 287-298 (1999). DOI: 10.1145/304181.304208
Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala, Sridhar Ramaswamy. Join synopses for approximate query answering. ACM SIGMOD Record, vol. 28, pp. 275-286 (1999). DOI: 10.1145/304181.304207
Graham Cormode, Minos Garofalakis, Peter J. Haas, Chris Jermaine. Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches (2012).
Brian Babcock, Surajit Chaudhuri, Gautam Das. Dynamic sample selection for approximate query processing. International Conference on Management of Data, pp. 539-550 (2003). DOI: 10.1145/872757.872822