MII: A novel content defined chunking algorithm for finding incremental data in data synchronization

Changjian Zhang; Deyu Qi; Zhe Cai; Wenhao Huang; Xinyang Wang; Wenlin Li; Jing Guo

Journal Article, Open Access

IEEE Access (2019), vol. 7, pp. 86932-86945

DOI: 10.1109/ACCESS.2019.2926195

Abstract

In data backup systems, full backup incurs high bandwidth and processing-time overhead when synchronizing backups with source data, so incremental backup has become a focus of academic and industrial research. The key problem for incremental backup, and one that remains poorly solved, is finding the incremental data between the backup and the source data. In this paper, we propose a novel content-defined chunking algorithm for finding incremental data during the backup process. The source data and the backup data are split in the same way into small, variable-length chunks. A chunk of source data is then judged to be incremental data if it differs from every chunk in the backup data. In experiments, we compare the proposed chunking algorithm with classical and state-of-the-art algorithms. The results show that, at the same chunking throughput, the incremental data found by this algorithm can be reduced by 13%-34% compared with the others.
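The chunk-and-compare workflow described in the abstract can be illustrated with a minimal sketch. The boundary rule below is a generic rolling-sum cut condition chosen only for illustration; it is not the MII algorithm, whose details the abstract does not give. The function names `chunk` and `find_incremental`, and the parameters `window`, `mask`, `min_size`, and `max_size`, are hypothetical, as is the use of SHA-256 digests to compare chunks.

```python
import hashlib


def chunk(data: bytes, window: int = 16, mask: int = 0x3F,
          min_size: int = 32, max_size: int = 1024) -> list[bytes]:
    """Split data into variable-length chunks using a generic
    content-defined rule (NOT the MII algorithm): cut where the sum of
    the last `window` bytes has all bits of `mask` set, subject to
    minimum and maximum chunk sizes."""
    chunks = []
    start = 0      # index where the current chunk begins
    rolling = 0    # sum of up to `window` most recent bytes in this chunk
    for i, byte in enumerate(data):
        rolling += byte
        if i - start >= window:
            rolling -= data[i - window]  # drop the byte leaving the window
        size = i - start + 1
        at_boundary = size >= min_size and (rolling & mask) == mask
        if at_boundary or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            rolling = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes form the last chunk
    return chunks


def find_incremental(source: bytes, backup: bytes) -> list[bytes]:
    """Return the source chunks that match no chunk of the backup.
    Chunks are compared by SHA-256 digest (an assumption of this sketch)."""
    known = {hashlib.sha256(c).digest() for c in chunk(backup)}
    return [c for c in chunk(source)
            if hashlib.sha256(c).digest() not in known]


if __name__ == "__main__":
    backup = bytes((i * 7 + 3) % 251 for i in range(5000))
    source = backup[:2000] + b"\xff\xff\xff" + backup[2000:]
    incremental = find_incremental(source, backup)
    print(len(chunk(source)), "source chunks,",
          len(incremental), "flagged as incremental")
```

Because the cut points depend on content rather than on fixed offsets, an insertion in the source tends to change only nearby chunks, which is why both sides must be chunked in the same way before comparison. How much data ends up flagged depends on the boundary rule, which is the aspect the paper's algorithm targets.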

Cite

Zhang, C., Qi, D., Cai, Z., Huang, W., Wang, X., Li, W., & Guo, J. (2019). MII: A novel content defined chunking algorithm for finding incremental data in data synchronization. IEEE Access, 7, 86932–86945. https://doi.org/10.1109/ACCESS.2019.2926195
