Using Word Embeddings to Enforce Document-Level Lexical Consistency in Machine Translation

  • Garcia E
  • Creus C
  • España-Bonet C
  • Màrquez L

Abstract

We integrate new mechanisms into a document-level machine translation decoder to improve the lexical consistency of document translations. First, we develop a document-level feature designed to score the lexical consistency of a translation. This feature, which applies to words that have been translated into different forms within the document, uses word embeddings to measure the adequacy of each word translation given its context. Second, we extend the decoder with a new stochastic mechanism that, at translation time, allows changes to be introduced into the translation to improve its lexical consistency. We evaluate our system on English–Spanish document translation, and we conduct automatic and manual assessments of its quality. The automatic evaluation metrics, applied mainly at sentence level, do not reflect significant variations. In contrast, the manual evaluation shows that the system dealing with lexical consistency is preferred over both a standard sentence-level and a standard document-level phrase-based MT system.
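
The idea of using word embeddings to measure how well a candidate translation fits its context can be illustrated with a small sketch. This is not the authors' feature, whose exact formulation the abstract does not give; it assumes that adequacy is approximated by the cosine similarity between a candidate's vector and the mean vector of the surrounding target words. The function names (`cosine`, `mean_vector`, `score_candidates`) and the three-dimensional toy embeddings are hypothetical and invented for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Component-wise average of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def score_candidates(candidates, context_words, embeddings):
    """Score each candidate translation against the averaged context vector.

    Assumption: adequacy is approximated by cosine similarity; words without
    an embedding are skipped (context) or scored 0.0 (candidates).
    """
    context_vectors = [embeddings[w] for w in context_words if w in embeddings]
    if not context_vectors:
        return {c: 0.0 for c in candidates}
    context = mean_vector(context_vectors)
    return {
        c: cosine(embeddings[c], context) if c in embeddings else 0.0
        for c in candidates
    }


# Toy embeddings, invented for illustration (not trained vectors).
TOY_EMBEDDINGS = {
    "banco": [0.9, 0.1, 0.0],     # "bank" as a financial institution
    "orilla": [0.0, 0.2, 0.9],    # "bank" as a riverside
    "dinero": [0.8, 0.2, 0.1],
    "cuenta": [0.9, 0.0, 0.1],
    "préstamo": [0.7, 0.3, 0.0],
}

# One source word was translated inconsistently as "banco" and "orilla";
# pick the form that best matches the financial context.
scores = score_candidates(
    ["banco", "orilla"], ["dinero", "cuenta", "préstamo"], TOY_EMBEDDINGS
)
best = max(scores, key=scores.get)
print(best, scores)
```

In a real system the vectors would come from embeddings trained on large corpora, and a score of this kind would be one feature among several in the decoder rather than a standalone decision rule.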

Cite

Garcia, E. M., Creus, C., España-Bonet, C., & Màrquez, L. (2017). Using Word Embeddings to Enforce Document-Level Lexical Consistency in Machine Translation. The Prague Bulletin of Mathematical Linguistics, 108(1), 85–96. https://doi.org/10.1515/pralin-2017-0011
