Creating Hardware Component Knowledge Bases with Training Data Generation and Multi-task Learning

Abstract

Hardware component databases are vital resources in designing embedded systems. Because creating these databases requires hundreds of thousands of hours of manual data entry, they are proprietary, limited in the data they provide, and contain random data entry errors. We present a machine-learning-based approach for creating hardware component databases directly from datasheets. Extracting data directly from datasheets is challenging because (1) the data is relational in nature and relies on non-local context, (2) the documents are filled with technical jargon, and (3) the datasheets are PDFs, a format that decouples visual locality from locality in the document. Addressing this complexity has traditionally relied on human input, making it costly to scale. Our approach uses a rich data model, weak supervision, data augmentation, and multi-task learning to create these knowledge bases in a matter of days. We evaluate the approach on datasheets of three types of components and achieve an average quality of 77 F1 points, comparable to that of existing human-curated knowledge bases. We perform application studies that demonstrate the extraction of multiple data modalities, including numerical properties and images. We show that different sources of supervision, such as heuristics and human labels, have distinct advantages that can be combined to improve knowledge base quality. Finally, we present a case study showing how this approach changes the way practitioners create hardware component knowledge bases.
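
The abstract does not describe how heuristics are turned into training labels, so the following is only a minimal sketch of the general idea behind weak supervision, not the authors' implementation. It assumes that each extraction candidate is a dictionary with hypothetical `unit` and `row_header` fields, that heuristics are written as small labeling functions that vote or abstain, and that votes are combined by simple majority. The majority vote is a simplification chosen for illustration; weak supervision systems typically fit a model over the votes instead.

```python
from collections import Counter

# Label values used by the hypothetical labeling functions below.
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1


def lf_has_voltage_unit(candidate):
    """Vote POSITIVE when the value carries a voltage unit."""
    return POSITIVE if candidate["unit"] in {"V", "mV"} else ABSTAIN


def lf_row_mentions_temperature(candidate):
    """Vote NEGATIVE when the table row header is about temperature."""
    header = candidate["row_header"].lower()
    return NEGATIVE if "temperature" in header else ABSTAIN


def lf_row_mentions_voltage(candidate):
    """Vote POSITIVE when the table row header mentions voltage."""
    header = candidate["row_header"].lower()
    return POSITIVE if "voltage" in header else ABSTAIN


# Illustrative heuristics only; a real set would be written per component type.
LABELING_FUNCTIONS = [
    lf_has_voltage_unit,
    lf_row_mentions_temperature,
    lf_row_mentions_voltage,
]


def majority_vote(candidate, lfs=LABELING_FUNCTIONS):
    """Combine labeling-function votes; abstain on ties or when no one votes."""
    votes = [lf(candidate) for lf in lfs]
    counts = Counter(v for v in votes if v != ABSTAIN)
    pos, neg = counts[POSITIVE], counts[NEGATIVE]
    if pos == neg:
        return ABSTAIN
    return POSITIVE if pos > neg else NEGATIVE


if __name__ == "__main__":
    example = {"unit": "V", "row_header": "Collector-emitter voltage"}
    print(majority_vote(example))
```

Candidates that receive a non-abstaining label could then serve as noisy training examples, while a small set of human labels could be held out for evaluation. That split is one way the two supervision sources might complement each other; it is stated here as an assumption, not as the paper's procedure.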

Cite

Hsiao, L., Wu, S., Chiang, N., Ré, C., & Levis, P. (2020). Creating Hardware Component Knowledge Bases with Training Data Generation and Multi-task Learning. ACM Transactions on Embedded Computing Systems, 19(6). https://doi.org/10.1145/3391906
