Cost-effective Deployment of BERT Models in a Serverless Environment

Abstract

In this study, we demonstrate the viability of deploying BERT-style models to serverless environments in a production setting. Since the freely available pre-trained models are too large to be deployed in this way, we utilize knowledge distillation and fine-tune the models on proprietary datasets for two real-world tasks: sentiment analysis and semantic textual similarity. As a result, we obtain models that are tuned for a specific domain and deployable in serverless environments. The subsequent performance analysis shows that this solution results in latency levels acceptable for production use and that it is also a cost-effective approach for small-to-medium-sized deployments of BERT models, all without any infrastructure overhead.

Cite

Benešová, K., Švec, A., & Šuppa, M. (2021). Cost-effective Deployment of BERT Models in a Serverless Environment. In NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Industry Papers (pp. 187–195). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.naacl-industry.24
