Fast Compaction Algorithms for NoSQL Databases

作者: Mainak Ghosh , Indranil Gupta , Shalmoli Gupta , Nirman Kumar

DOI: 10.1109/ICDCS.2015.53

关键词:

摘要: Compaction plays a crucial role in NoSQL systems to ensure high overall read throughput. In this work, we formally define compaction as an optimization problem that attempts minimize disk I/O. We prove be NPHard. then propose set of algorithms and mathematically analyze upper bounds on worst-case cost. evaluate the proposed real-life workloads. Our results show our incur low I/O costs approach using balanced tree is most preferable.

参考文章(13)
Neal E. Young, Claire Mathieu, Carl Staelin, K-Slot SSTable Stack Compaction. ,(2014)
L. Lovász, Submodular functions and convexity Mathematical Programming The State of the Art. pp. 235- 257 ,(1983) , 10.1007/978-3-642-68874-4_10
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber, Bigtable ACM Transactions on Computer Systems. ,vol. 26, pp. 1- 26 ,(2008) , 10.1145/1365815.1365816
Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, Russell Sears, Benchmarking cloud serving systems with YCSB Proceedings of the 1st ACM symposium on Cloud computing - SoCC '10. pp. 143- 154 ,(2010) , 10.1145/1807128.1807152
Gerth Stolting Brodal, Rolf Fagerberg, Lower bounds for external memory dictionaries symposium on discrete algorithms. pp. 546- 554 ,(2003) , 10.5555/644108.644201
Matteo Frigo, Charles E. Leiserson, Harald Prokop, Sridhar Ramachandran, Cache-Oblivious Algorithms ACM Transactions on Algorithms. ,vol. 8, pp. 1- 22 ,(2012) , 10.1145/2071379.2071383
David Huffman, A Method for the Construction of Minimum-Redundancy Codes Proceedings of the IRE. ,vol. 40, pp. 1098- 1101 ,(1952) , 10.1109/JRPROC.1952.273898
Avinash Lakshman, Prashant Malik, Cassandra: a decentralized structured storage system Operating Systems Review. ,vol. 44, pp. 35- 40 ,(2010) , 10.1145/1773912.1773922
Michael Kozuch, Dejan Milojicic, David O'Hallaron, Steven Y. Ko, Kevin Lai, Hing Yan Lee, Martha Lyons, Indranil Gupta, Michael Heath, Roy Campbell, Yeng Chai Soh, Thomas Kwan, Marcel Kunze, Open Cirrus TM cloud computing testbed: federated data centers for open source systems and services research ieee international conference on cloud computing technology and science. pp. 1- ,(2009)
Chandra Chekuri, Rajeev Motwani, Precedence constrained scheduling to minimize sum of weighted completion times on a single machine Discrete Applied Mathematics. ,vol. 98, pp. 29- 38 ,(1999) , 10.1016/S0166-218X(98)00143-7