Abstract
The development workflow for today's AI applications has grown far beyond the standard model-training task. It typically consists of various data and model management tasks: a "data cycle" aimed at producing high-quality training data, and a "model cycle" aimed at managing trained models on their way to production. This broadened workflow has created space for an emerging class of tools and systems for AI development. However, as a research community, we still lack standardized ways to evaluate these tools and systems. In a humble effort to get this wheel turning, we developed dcbench, a benchmark for evaluating systems for data-centric AI development. In this report, we present the main ideas behind dcbench, the benchmark tasks included in its initial release, and a short summary of its implementation.
Citation
Eyuboglu, S., Karlaš, B., Ré, C., Zhang, C., & Zou, J. (2022). dcbench: A Benchmark for Data-Centric AI Systems. In Proceedings of the Sixth Workshop on Data Management for End-to-End Machine Learning (DEEM '22), in conjunction with the 2022 ACM SIGMOD/PODS Conference. Association for Computing Machinery. https://doi.org/10.1145/3533028.3533310