Due to the fast indiscriminate increase of digital data, data reduction has acquired increasing concentration and became a popular approach in large-scale storage systems. One of the most effective approaches for data reduction is Data Deduplication technique in which the redundant data at the file or sub-file level is detected and identifies by using a hash algorithm. Data Deduplication showed that it was much more efficient than the conventional compression technique in large-scale storage systems in terms of space reduction. Two Threshold Two Divisor (TTTD) chunking algorithm is one of the popular chunking algorithm used in deduplication. This algorithm needs time and many system resources to compute its chunk boundary. This paper presents new techniques to enhance TTTD chunking algorithm using a new fingerprint function, a multi-level hashing and matching technique, new indexing technique to store the Metadata. These new techniques consist of four hashing algorithm to solve the collision problem and adding a new chunk condition to the TTTD chunking conditions in order to increase the number of the small chunks which leads to increasing the Deduplication Ratio. This enhancement improves the Deduplication Ratio produced by TTTD algorithm and reduces the system resources needed by this algorithm. The proposed algorithm is tested in terms of Deduplication Ratio, execution time, and Metadata size.
CITATION STYLE
Jasim, H. A. S., & Fahad, A. A. (2018). New techniques to enhance data deduplication using content based-TTTD chunking algorithm. International Journal of Advanced Computer Science and Applications, 9(5), 116–121. https://doi.org/10.14569/IJACSA.2018.090515
Mendeley helps you to discover research relevant for your work.