In this paper we present a process for quantifying the behavioral impact of a domain-customized machine translation system deployed on a large-scale e-commerce platform. We describe several machine translation systems that we trained using aligned text from product listing descriptions written in multiple languages. We document the quality improvements of these systems as measured by automated metrics and crowdsourced human quality assessments. We then measure the effect of these quality improvements on user behavior using an automated A/B testing framework. Through this testing we observed an increase in key e-commerce metrics, including a significant increase in purchases.
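The paper itself does not include code, but a minimal sketch may help illustrate the kind of significance test commonly applied to A/B bucket data when judging whether a lift in purchase rate is significant. The function below is a standard two-proportion z-test, not the authors' implementation, and the bucket counts are made-up placeholders.

```python
# Illustrative sketch (not the authors' code): a two-proportion z-test of the
# kind typically used to decide whether the difference in purchase rate between
# an A/B test's control and treatment buckets is statistically significant.
import math


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Return (z statistic, two-sided p-value) for H0: p_a == p_b."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)            # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal tail, via erf.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value


# Hypothetical bucket totals: control (baseline MT) vs. treatment (customized MT).
z, p = two_proportion_ztest(conv_a=4_810, n_a=1_000_000,
                            conv_b=5_150, n_b=1_000_000)
print(f"z = {z:.2f}, p = {p:.4f}")  # reject H0 at alpha = 0.05 if p < 0.05
```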
Russell, B., & Gillespie, D. (2016). Measuring the behavioral impact of machine translation quality improvements with A/B testing. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016) (pp. 2295–2299). Association for Computational Linguistics. https://doi.org/10.18653/v1/d16-1251