Demonstration of VerdictDB, the Platform-Independent AQP System

Authors: Wen He, Yongjoo Park, Idris Hanafi, Jacob Yatvitskiy, Barzan Mozafari

DOI: 10.1145/3183713.3193538

Abstract: We demonstrate VerdictDB, the first platform-independent approximate query processing (AQP) system. Unlike existing AQP systems that are tightly integrated into a specific database, VerdictDB operates at the driver level, acting as middleware between users and off-the-shelf database systems. In other words, VerdictDB requires no modifications to database internals; it simply relies on rewriting incoming queries such that the standard execution of the rewritten queries under relational semantics yields approximate answers to the original queries. VerdictDB exploits a novel technique for error estimation called variational subsampling, which is amenable to efficient computation via SQL. In this demonstration, we showcase VerdictDB's performance benefits (up to two orders of magnitude) compared to queries issued directly to existing query engines. We also illustrate that the approximate answers returned by VerdictDB are nearly identical to the exact answers. We use Apache Spark SQL and Amazon Redshift as two examples of modern distributed query platforms. We allow the audience to explore VerdictDB using a web-based interface (e.g., Hue or Zeppelin) to issue queries and visualize their answers. VerdictDB is currently open-sourced and available under the Apache License (V2).
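The query-rewriting idea in the abstract can be illustrated with a minimal sketch. This is not VerdictDB's rewriter or API: the table `orders`, the `_sample` naming convention, and the helpers `build_demo`, `rewrite_sum_query`, and `run_demo` are all hypothetical names chosen for illustration, and SQLite stands in for the backend engine. The sketch assumes a uniform Bernoulli sample stored as an ordinary table and rewrites a `SUM` query so that plain SQL execution on the sample returns a scaled estimate. It omits error estimation entirely, so variational subsampling is not shown.

```python
import random
import sqlite3


def build_demo(conn, n_rows=100_000, ratio=0.05, seed=0):
    """Create a base table and a uniform sample of it (illustrative setup only)."""
    rng = random.Random(seed)
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, price REAL)")
    conn.executemany(
        "INSERT INTO orders (id, price) VALUES (?, ?)",
        ((i, rng.uniform(1.0, 100.0)) for i in range(n_rows)),
    )
    # Keep each row independently with probability `ratio` (Bernoulli sampling).
    rows = conn.execute("SELECT id, price FROM orders").fetchall()
    sampled = [row for row in rows if rng.random() < ratio]
    conn.execute("CREATE TABLE orders_sample (id INTEGER, price REAL)")
    conn.executemany("INSERT INTO orders_sample VALUES (?, ?)", sampled)
    return ratio


def rewrite_sum_query(column, table, ratio):
    """Rewrite SUM(column) over `table` into a scaled SUM over its sample table.

    Hypothetical string-level rewrite; a real middleware would parse the query.
    """
    return f"SELECT SUM({column}) / {ratio} AS approx_sum FROM {table}_sample"


def run_demo():
    """Return (exact, approximate) answers for SUM(price) on the demo data."""
    conn = sqlite3.connect(":memory:")
    ratio = build_demo(conn)
    exact = conn.execute("SELECT SUM(price) FROM orders").fetchone()[0]
    approx = conn.execute(rewrite_sum_query("price", "orders", ratio)).fetchone()[0]
    conn.close()
    return exact, approx


if __name__ == "__main__":
    exact, approx = run_demo()
    print(f"exact={exact:.1f} approx={approx:.1f} "
          f"relative error={abs(approx - exact) / exact:.3%}")
```

Because the rewritten statement is ordinary SQL, the backend needs no changes, and the approximate query scans only the sample table rather than the full one. That is the sense in which a driver-level design can be platform-independent.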
