Reducing cohort bias in natural language understanding systems with targeted self-training scheme


Abstract

Bias in machine learning models can arise when the models are trained on particular types of data that do not generalize well, causing underperformance for certain groups of users. In this work, we focus on reducing the bias related to new customers in a digital voice assistant system. Natural language understanding models are often observed to perform worse on requests from new users than on requests from experienced users. To mitigate this problem, we propose a framework that consists of two phases: (1) a fixing phase with four active learning strategies used to identify important samples coming from new users, and (2) a self-training phase where a teacher model trained in the first phase is used to annotate semi-supervised samples to expand the training data with relevant cohort utterances. We describe practical strategies that involve identifying representative cohort-based samples through density clustering as well as employing implicit customer feedback to improve new customers' experience. We demonstrate the effectiveness of our approach in a real-world, large-scale voice assistant system for two languages, German and French, through a number of experiments.
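The self-training phase described above can be sketched as a confidence-filtered pseudo-labeling loop: a teacher model labels unlabeled utterances from the new-customer cohort, and only confident pseudo-labels are added to the training data. This is a minimal illustrative sketch, not the paper's implementation; the function name, the `teacher_predict` interface, and the confidence threshold are all assumptions introduced here.

```python
# Hypothetical sketch of the self-training phase: a teacher model (trained
# in the fixing phase) annotates unlabeled cohort utterances, and only
# high-confidence pseudo-labels are kept to expand the training data.
# All names and the 0.9 threshold are illustrative, not from the paper.

def self_train_expand(teacher_predict, labeled, unlabeled, threshold=0.9):
    """teacher_predict(utterance) -> (label, confidence); returns the
    labeled set expanded with confident pseudo-labeled cohort samples."""
    expanded = list(labeled)
    for utterance in unlabeled:
        label, confidence = teacher_predict(utterance)
        if confidence >= threshold:  # discard uncertain teacher labels
            expanded.append((utterance, label))
    return expanded

# Toy usage: a hypothetical teacher that is confident only on "play" requests.
teacher = lambda u: ("PlayMusic", 0.95) if "play" in u else ("Unknown", 0.30)
labeled = [("turn on the lights", "SmartHome")]
unlabeled = ["play some jazz", "uh what was that"]
expanded = self_train_expand(teacher, labeled, unlabeled)
# expanded now holds the original pair plus one confident pseudo-label
```

In practice the paper additionally selects representative cohort samples via density clustering before this step; the threshold-based filter here stands in for whatever confidence criterion the deployed system uses.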

Citation (APA)

Le, D. T., Cortes, G., Chen, B., & Bradford, M. (2023). Reducing cohort bias in natural language understanding systems with targeted self-training scheme. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 5, pp. 552–560). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.acl-industry.53
