Quality Scoring of Source Words in Machine Translations

Abstract

Word-level quality scores on input source sentences can provide useful feedback to an end-user when translating into an unfamiliar target language. Recent approaches either require training custom models on synthetic data or repeatedly invoke the translation model. We propose a simple approach based on comparing probabilities from two language models. The basic premise of our method is to reason about how well each source word is explained by the generated translation, as opposed to by the preceding source-language words alone. Our approach yields an F1 score between 2.2 and 27.1 points higher than state-of-the-art methods on three language pairs, and is significantly faster. Moreover, our method does not require training any new model. We release a public dataset of word omissions and mistranslations for a new language pair.
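The two-model comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' released implementation: the Hugging Face model names, the hypothetical German-to-English setup, the whitespace word segmentation, and the prefix-based word scoring are all assumptions made for the sketch. A reverse (target-to-source) translation model stands in for p(source word | translation, preceding source words), and a source-side language model for p(source word | preceding source words); a low log-ratio flags a source word poorly explained by the translation.

```python
import torch
from transformers import (AutoModelForCausalLM, AutoModelForSeq2SeqLM,
                          AutoTokenizer)

# Hypothetical setup for a German source translated into English.
# A reverse (English->German) MT model approximates p(source | translation);
# a German causal LM approximates p(source word | preceding source words).
REV_MT_NAME = "Helsinki-NLP/opus-mt-en-de"  # assumed model choice
SRC_LM_NAME = "dbmdz/german-gpt2"           # assumed model choice

rev_tok = AutoTokenizer.from_pretrained(REV_MT_NAME)
rev_mt = AutoModelForSeq2SeqLM.from_pretrained(REV_MT_NAME).eval()
lm_tok = AutoTokenizer.from_pretrained(SRC_LM_NAME)
src_lm = AutoModelForCausalLM.from_pretrained(SRC_LM_NAME).eval()


@torch.no_grad()
def mt_logprob(source_prefix: str, translation: str) -> float:
    """Total log p(source_prefix | translation) under the reverse MT model."""
    enc = rev_tok(translation, return_tensors="pt")
    # Drop the EOS label so that prefix differences isolate the new word.
    labels = rev_tok(text_target=source_prefix,
                     return_tensors="pt").input_ids[:, :-1]
    loss = rev_mt(**enc, labels=labels).loss  # mean NLL per label token
    return -loss.item() * labels.shape[1]


@torch.no_grad()
def lm_logprob(source_prefix: str) -> float:
    """Total log p(source_prefix) under the source LM (first token unscored)."""
    ids = lm_tok(source_prefix, return_tensors="pt").input_ids
    if ids.shape[1] < 2:
        return 0.0
    loss = src_lm(ids, labels=ids).loss  # mean NLL over len-1 predictions
    return -loss.item() * (ids.shape[1] - 1)


def word_scores(source: str, translation: str) -> list[tuple[str, float]]:
    """Score each source word by how much better the translation explains it
    than the preceding source words alone; low scores flag words that are
    plausibly omitted or mistranslated in the output."""
    words = source.split()
    scores, prev_mt, prev_lm = [], 0.0, 0.0
    for k in range(1, len(words) + 1):
        prefix = " ".join(words[:k])
        cur_mt, cur_lm = mt_logprob(prefix, translation), lm_logprob(prefix)
        # score_k = log p(word_k | translation, x_<k) - log p(word_k | x_<k)
        scores.append((words[k - 1], (cur_mt - prev_mt) - (cur_lm - prev_lm)))
        prev_mt, prev_lm = cur_mt, cur_lm
    return scores


if __name__ == "__main__":
    src = "Der schnelle braune Fuchs springt über den faulen Hund ."
    hyp = "The quick brown fox jumps over the dog ."  # "faulen" (lazy) dropped
    for word, score in word_scores(src, hyp):
        print(f"{word:12s} {score:+7.2f}")
```

The O(n²) prefix loop is only there to keep the sketch tokenizer-agnostic; a real implementation would score every token in a single forward pass per model and map subword log-probabilities back to words, which is consistent with the abstract's claim that the translation model need not be invoked repeatedly.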

Citation (APA)

Jain, P., Sarawagi, S., & Tomar, T. (2022). Quality Scoring of Source Words in Machine Translations. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 (pp. 10683–10691). Association for Computational Linguistics (ACL).
