MII: A novel content defined chunking algorithm for finding incremental data in data synchronization

16Citations
Citations of this article
9Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

In the data backup system, to reduce the bandwidth and processing time overhead caused by full backup technology during data synchronization between backups and source data, incremental backup technology is emerging as the focus of academic and industrial research. It is key but poorly-solved to find the incremental data between backups and source data for incremental backup technology. To find out the incremental data during the backup process, here, in this paper, we propose a novel content-defined chunking algorithm. The source data and backup data are chunked into some small chunks in the same way with the variable length. Then, by comparing whether a chunk of source data is different from any of the chunks in backup data, we can evaluate whether the chunk of source data is incremental data. By experiments, the chunking algorithm in this paper is compared to other ones which are the classical or state-of-the-art algorithms. The experimental results show that the incremental data found by this algorithm can be reduced by 13%-34% compared to the others with the same chunk throughput.

Cite

CITATION STYLE

APA

Zhang, C., Qi, D., Cai, Z., Huang, W., Wang, X., Li, W., & Guo, J. (2019). MII: A novel content defined chunking algorithm for finding incremental data in data synchronization. IEEE Access, 7, 86932–86945. https://doi.org/10.1109/ACCESS.2019.2926195

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free