PRESIDIO

Authors: Lawrence L. You, Kristal T. Pollack, Darrell D. E. Long, K. Gopinath

DOI: 10.1145/1970348.1970351

Abstract: The ever-increasing volume of archival data that needs to be reliably retained for long periods of time, and the decreasing costs of disk storage, memory, and processing, have motivated the design of low-cost, high-efficiency disk-based storage systems. However, managing disk storage is still expensive. To further lower the cost, redundancy can be eliminated with the use of interfile and intrafile data compression. However, it is not clear what the optimal strategy for compressing data is, given the diverse collections of data. To create a scalable archival storage system that efficiently stores diverse data, we present PRESIDIO, a framework that selects from different space-reduction efficient storage methods (ESMs) to detect similarity and reduce or eliminate redundancy when storing objects. In addition, the framework uses a virtualized content addressable store (VCAS) that hides from the user the complexity of knowing which space-efficient techniques are used, including chunk-based deduplication and delta compression. Storing and retrieving objects are polymorphic operations independent of their content-based address. A new technique, harmonic super-fingerprinting, is also used for obtaining successively more accurate (but also more costly) measures of similarity to identify the existing objects in a very large data set that are most similar to an incoming new object. The PRESIDIO design, when reported earlier, had comprehensively introduced for the first time the notion of deduplication, which is now being offered as a service in storage systems by major vendors. As an aid to the design of such systems, we evaluate and present various parameters that affect the efficiency of a storage system using empirical data.
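
The abstract describes storing objects behind a content-based address while the store decides how to save space, for example with chunk-based deduplication. The following is a minimal Python sketch of that idea only, not the PRESIDIO implementation: the class name `TinyCAS`, its methods, the fixed-size chunking, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


class TinyCAS:
    """Toy content-addressable store with chunk-level deduplication.

    Hypothetical illustration; fixed-size chunks and SHA-256 are assumptions.
    """

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}   # chunk digest -> chunk bytes (each stored once)
        self.objects = {}  # object address -> ordered list of chunk digests

    @staticmethod
    def _digest(data):
        return hashlib.sha256(data).hexdigest()

    def store(self, data):
        """Store an object and return its content-based address."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            key = self._digest(chunk)
            # A chunk that is already present is not stored a second time.
            self.chunks.setdefault(key, chunk)
            recipe.append(key)
        address = self._digest(data)
        self.objects[address] = recipe
        return address

    def retrieve(self, address):
        """Rebuild an object from its chunk list; the caller sees only bytes."""
        return b"".join(self.chunks[k] for k in self.objects[address])

    def stored_bytes(self):
        """Total bytes of unique chunk data held by the store."""
        return sum(len(c) for c in self.chunks.values())
```

In this sketch, storing `b"AAAABBBBCCCC"` and `b"AAAABBBBDDDD"` with `chunk_size=4` keeps the shared chunks `AAAA` and `BBBB` only once, while `retrieve` returns each object unchanged from its address. A fuller system along the lines the abstract describes would also choose among other methods such as delta compression; that selection is not modeled here.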
