Optimizing hash table structure for digest matching in a data deduplication system

Author: Lior Aronovich

Keywords: Hash function, Algorithm, Hash tree, Data deduplication, Hash table, Information retrieval, Linear hashing, Dynamic perfect hashing, Double hashing, Hash buster, Computer science

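Abstract: Repository data intervals are determined as similar to an input data interval. Repository digests corresponding to the repository data intervals are loaded into a sequential representation and into a search structure. Matches of input and repository digests are found using the search structure. Each one of the matches is extended using the sequential representation. Data matches are determined between the input data and the repository data based on the matches between digests. A compact index pointing to a position in the sequential representation is incorporated in the entries of the search structure.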
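The flow described in the abstract can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the author's implementation: the function names `build_structures` and `find_matches` are hypothetical, a Python list stands in for the sequential representation, a dict stands in for the search structure, and matches are extended forward only.

```python
from typing import Dict, List, Tuple


def build_structures(repo_digests: List[bytes]) -> Tuple[List[bytes], Dict[bytes, int]]:
    """Load repository digests into a sequential list and a hash table.

    Assumption: each table entry holds only a compact integer position
    into the sequential list, rather than any larger per-digest record.
    """
    sequential = list(repo_digests)
    table: Dict[bytes, int] = {}
    for pos, digest in enumerate(sequential):
        # Keep the first occurrence of a repeated digest (a simplification).
        table.setdefault(digest, pos)
    return sequential, table


def find_matches(
    input_digests: List[bytes],
    sequential: List[bytes],
    table: Dict[bytes, int],
) -> List[Tuple[int, int, int]]:
    """Return (input_start, repo_start, length) runs of matching digests."""
    matches: List[Tuple[int, int, int]] = []
    i = 0
    while i < len(input_digests):
        pos = table.get(input_digests[i])
        if pos is None:
            i += 1
            continue
        # Seed match found through the table; extend it forward by walking
        # the sequential list instead of probing the table again.
        length = 1
        while (
            i + length < len(input_digests)
            and pos + length < len(sequential)
            and input_digests[i + length] == sequential[pos + length]
        ):
            length += 1
        matches.append((i, pos, length))
        i += length
    return matches


repo = [b"a", b"b", b"c", b"d"]
incoming = [b"x", b"b", b"c", b"y", b"a"]
seq, tbl = build_structures(repo)
result = find_matches(incoming, seq, tbl)
```

In this sketch the table is consulted once per seed match, and the extension step compares neighbouring digests directly in the list, which is one plausible reason for keeping only a position index in each table entry.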
