An Optimal Recovery Approach for Liberation Codes in Distributed Storage Systems

4Citations
Citations of this article
6Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

To reduce the storage cost, distributed storage systems are gradually using erasure codes to ensure data reliability. Liberation codes, which satisfy the maximum distance separable (MDS) property and provide optimal modification overhead, are a class of popular two fault tolerant erasure codes. However, erasure codes need to read from surviving nodes and transfer across the network large amounts of data when recovering from single node failures. Existing single node failure recovery approaches for Liberation codes are either time-consuming or suboptimal. In this article, firstly, we prove the minimum number of symbols required to recover one failed node for a Liberation coded system. Then we derive the conditions that optimal recovery solutions need to satisfy. Finally, we propose an algorithm, called Disk Read Optimal Recovery (DROR), which can determine an optimal recovery solution in linear time and recover the failed node reading the minimum amount of data. We have implemented DROR in a real-world storage system Ceph and evaluated DROR on a cluster of Amazon EC2 instances. We show that DROR reduces the reconstruction time by up to 23.6% compared to that of the recovery approach in Ceph.

Cite

CITATION STYLE

APA

Liang, N., Zhang, X., Yang, H., Dong, X., & Zhang, C. (2020). An Optimal Recovery Approach for Liberation Codes in Distributed Storage Systems. IEEE Access, 8, 137631–137645. https://doi.org/10.1109/ACCESS.2020.3012190

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free