Research on Chunking Algorithms of Data De-duplication

Authors: Cai Bo, Zhang Feng Li, Wang Can

DOI: 10.1007/978-3-642-31698-2_144

Keywords: Computer science, Storage area, Data deduplication, Data redundancy, Bandwidth (computing), Backup, Algorithm

Abstract: Data de-duplication is a technology for detecting data redundancy and is often used to reduce storage space and network bandwidth. It is now one of the hottest research topics in the backup area. In this paper, five representative chunking algorithms are introduced and their performance on a real data set is compared. The experimental results show that the performance of these methods improves noticeably from whole-file chunking to TTTD chunking. Based on an analysis of their features, we provide some references for storage systems to choose the best chunking algorithm for eliminating data redundancy.
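To make the comparison concrete, the following is a minimal sketch of why finer-grained chunking can detect more redundancy than whole-file chunking. It is not the paper's implementation. The fixed-size chunker, the 8-byte chunk size, the SHA-1 fingerprint, the helper names (`whole_file_chunks`, `fixed_size_chunks`, `stored_bytes`), and the two sample "backup versions" are all assumptions chosen for illustration. TTTD itself is not implemented here.

```python
import hashlib


def whole_file_chunks(data: bytes):
    """Whole-file chunking: the entire file is treated as one chunk."""
    return [data] if data else []


def fixed_size_chunks(data: bytes, size: int = 8):
    """Fixed-size chunking (illustrative assumption): cut every `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def stored_bytes(files, chunker):
    """Bytes kept after de-duplication: each distinct chunk is stored once."""
    seen = set()   # fingerprints of chunks already stored
    total = 0
    for data in files:
        for chunk in chunker(data):
            digest = hashlib.sha1(chunk).digest()
            if digest not in seen:
                seen.add(digest)
                total += len(chunk)
    return total


# Two hypothetical backup versions that differ only in their last 8 bytes.
v1 = b"AAAAAAAA" + b"BBBBBBBB" + b"CCCCCCCC"
v2 = b"AAAAAAAA" + b"BBBBBBBB" + b"DDDDDDDD"

whole = stored_bytes([v1, v2], whole_file_chunks)
fixed = stored_bytes([v1, v2], fixed_size_chunks)
print(whole, fixed)  # 48 32
```

In this toy input, whole-file chunking stores both versions in full because their fingerprints differ, while chunk-level de-duplication stores the two shared blocks only once. Content-defined methods such as TTTD place chunk boundaries based on the data itself rather than at fixed offsets, which this sketch does not model.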
