Persistent key-value stores play a crucial role in enabling internet-scale services. At Alibaba Cloud, scale-out cloud storage services including Object Storage Service, File Storage Service and Tablestore are built on distributed key-value stores. Key challenges in the design of the underlying key-value engine for these services lie in utilization of disaggregated storage, supporting write and range query-heavy workloads, and balancing of scalability, availability and resource usage. This paper presents ArkDB, a key-value engine designed to address these challenges by combining advantages of both LSM tree and Bw-tree, and leveraging advances in hardware technologies. Built on top of Pangu, an append-only distributed file system, ArkDB's innovations include shrinkable page mapping table, clear separation of system and user states for fast recovery, write amplification reduction, efficient garbage collection and lightweight partition split and merge. Experimental results demonstrate ArkDB's improvements over existing designs. Compared with Bw-tree, ArkDB efficiently stabilizes the mapping table size despite continuous write working set growth. Compared with RocksDB, an LSM tree-based key-value engine, ArkDB increases ingestion throughput by 2.16x, while reducing write amplification by 3.1x. It outperforms RocksDB by 52% and 37% respectively on a write-heavy workload and a range query-intensive workload of the Yahoo! Cloud Serving Benchmark. Experiments running in Tablestore in a cluster environment further demonstrate ArkDB's performance on Pangu and its efficient partition split/merge support.
CITATION STYLE
Pang, Z., Lu, Q., Chen, S., Wang, R., Xu, Y., & Wu, J. (2021). ArkDB: A Key-Value Engine for Scalable Cloud Storage Services. In Proceedings of the ACM SIGMOD International Conference on Management of Data (pp. 2570–2583). Association for Computing Machinery. https://doi.org/10.1145/3448016.3457553
Mendeley helps you to discover research relevant for your work.