Clustering files in deduplication systems

Authors: Menezes Guilherme, Reza Abdullah

DOI:

Keywords:

Abstract: Clustering files in deduplication systems is based on an estimate of similarity between files in a file system. The estimates are based on how much content the files share, where shared content is measured in shared segments. Shared segment offsets are found from the files' bitmap vectors of segment offsets. The shared segment offsets are used to generate a cluster definition approximating an optimal data structure for clustering files that share content. The approximated data structure defines clusters that are hierarchically arranged by offset numbers.
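
The idea in the abstract can be illustrated with a minimal sketch. It is not the method claimed in the record. It assumes each file is summarized as a bitmap of the segment offsets it contains, estimates similarity with a Jaccard ratio over shared offsets, and merges the most similar clusters greedily into a nested hierarchy. The function names (to_bitmap, shared_offsets, similarity, cluster), the Jaccard measure, and the threshold parameter are all hypothetical choices made for illustration.

```python
from itertools import combinations


def to_bitmap(offsets):
    """Pack a collection of segment offsets into an int used as a bitmap vector."""
    bitmap = 0
    for off in offsets:
        bitmap |= 1 << off
    return bitmap


def shared_offsets(a, b):
    """Return the sorted offsets whose bits are set in both bitmaps."""
    both = a & b
    return [i for i in range(both.bit_length()) if (both >> i) & 1]


def similarity(a, b):
    """Jaccard estimate (an assumption): shared offsets over offsets in either file."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 0.0


def cluster(bitmaps, threshold=0.5):
    """Greedily merge the most similar pair until no pair reaches the threshold.

    Returns a list of trees; a tree is a file name or a nested pair of trees.
    """
    # Each entry is (tree, bitmap), where bitmap is the OR of the members' bitmaps.
    clusters = list(bitmaps.items())
    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: similarity(clusters[p[0]][1], clusters[p[1]][1]),
        )
        if similarity(clusters[i][1], clusters[j][1]) < threshold:
            break
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] | clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [tree for tree, _ in clusters]


files = {
    "a": to_bitmap({0, 1, 2, 3}),
    "b": to_bitmap({0, 1, 2, 4}),
    "c": to_bitmap({10, 11}),
}
print(shared_offsets(files["a"], files["b"]))  # [0, 1, 2]
print(cluster(files))                          # ['c', ('a', 'b')]
```

Files a and b share three of five distinct offsets, so they merge into one cluster, while c shares nothing with them and stays separate. A real system would work on far larger bitmaps and would likely use a different merge strategy; the quadratic pair search here is only meant to keep the sketch short.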

References (15)
Matthew R. McDaniel, Arthur Beaverson, Techniques using identifiers and signatures with data operations (2006)
Mihnea Andrei, Anil Kumar Goel, Rolando Blanco, In-memory bitmap for column store operations (2013)
Ashish Batwara, Nisha Talagala, David Flynn, Swaminathan Sundararaman, Nick Piggin, Hybrid checkpointed memory (2013)
Umesh Maheshwari, R. Hugo Patterson, Locality-based stream segmentation for data deduplication (2006)
Windsor W. Hsu, R. Hugo Patterson, System and method for providing long-term storage for data (2010)
Michael W. Healey, Arthur Beaverson, Techniques for performing a prioritized data restoration operation (2006)
R. Hugo Patterson, Kai Li, Ming Benjamin Zhu, Efficient data storage system (2010)
Matthew R. McDaniel, Michael W. Healey, Arthur Beaverson, Techniques for performing a restoration operation using device scanning (2006)
Daniel S. Collins, Russell R. Laporte, Processing vectorized elements associated with IT system images (2013)