dcbench: A Benchmark for Data-Centric AI Systems


Abstract

The development workflow for today's AI applications has grown far beyond the standard model training task. This workflow typically consists of various data and model management tasks. It includes a "data cycle" aimed at producing high-quality training data, and a "model cycle" aimed at managing trained models on their way to production. This broadened workflow has opened a space for emerging tools and systems for AI development. However, as a research community, we still lack standardized ways to evaluate these tools and systems. In a humble effort to get this wheel turning, we developed dcbench, a benchmark for evaluating systems for data-centric AI development. In this report, we present the main ideas behind dcbench, some benchmark tasks included in the initial release, and a short summary of its implementation.

Citation (APA)

Eyuboglu, S., Karlaš, B., Ré, C., Zhang, C., & Zou, J. (2022). dcbench: A Benchmark for Data-Centric AI Systems. In Proceedings of the 6th Workshop on Data Management for End-To-End Machine Learning, DEEM 2022 - In conjunction with the 2022 ACM SIGMOD/PODS Conference. Association for Computing Machinery, Inc. https://doi.org/10.1145/3533028.3533310
