Improving Large-Scale Conversational Assistants using Model Interpretation based Training Sample Selection


Abstract

Natural language understanding (NLU) models are a core component of large-scale conversational assistants. Collecting training data for these models through manual annotation is slow and expensive, which impedes the pace of model improvement. We present a three-stage approach to address this challenge: First, we identify a large set of relatively infrequent utterances from live traffic where the users implicitly communicated satisfaction with a response (such as by not interrupting), along with the existing model outputs as candidate annotations. Second, we identify a small subset of these utterances using Integrated Gradients-based importance scores computed with the current models. Finally, we augment our training sets with these utterances and retrain our models. We demonstrate the effectiveness of our approach in a large-scale conversational assistant, processing billions of utterances every week. By augmenting our training set with just 0.05% more utterances through our approach, we observe statistically significant improvements for infrequent tail utterances: a 0.45% reduction in semantic error rate (SemER) in offline experiments, and a 1.23% reduction in defect rates in online A/B tests.
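The abstract does not specify how the Integrated Gradients scores are computed; as a generic illustration of the second stage, here is a minimal sketch of Integrated Gradients on a toy differentiable scorer. The logistic-regression model, its weights, and the `steps` parameter are hypothetical stand-ins for the assistant's NLU model, not the authors' implementation:

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=50):
    """Midpoint Riemann-sum approximation of the Integrated Gradients
    path integral from `baseline` to `x` for a scalar-valued model."""
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x, dtype=float)
    for a in alphas:
        total += grad_f(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Toy "model": a logistic-regression score standing in for the NLU model.
w = np.array([0.8, -1.2, 0.5])

def f(x):
    return 1.0 / (1.0 + np.exp(-np.dot(w, x)))

def grad_f(x):
    s = f(x)
    return w * s * (1.0 - s)

x = np.array([1.0, 2.0, 0.5])       # hypothetical utterance features
baseline = np.zeros(3)               # all-zero reference input
attr = integrated_gradients(grad_f, x, baseline)

# Completeness axiom: attributions sum to f(x) - f(baseline).
assert abs(attr.sum() - (f(x) - f(baseline))) < 1e-3

# A simple (assumed) selection heuristic: rank candidate utterances by
# total attribution magnitude and keep the top fraction for annotation.
importance = np.abs(attr).sum()
```

In practice the per-utterance importance scores would be computed with the deployed model's gradients over token embeddings, and the ranking criterion used to pick the small augmentation subset is a design choice the abstract leaves open.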

Citation (APA)
Schroedl, S., Kumar, M., Hajebi, K., Ziyadi, M., Venkathapaty, S., Ramakrishna, A., … Natarajan, P. (2022). Improving Large-Scale Conversational Assistants using Model Interpretation based Training Sample Selection. In EMNLP 2022 - Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track (pp. 381–388). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.emnlp-industry.37
