Information about the antecedents of pronouns is considered essential for resolving certain translation divergences, such as those arising when the English pronoun it is translated into a gendered language, e.g. into French as il, elle, or one of several other options. However, no machine translation system using anaphora resolution has so far been able to outperform a phrase-based statistical MT baseline. We address here one of the reasons for this failure: the imperfection of automatic anaphora resolution algorithms. Using parallel data, we learn probabilistic correlations between target-side pronouns and the gender and number features of their (uncertain) antecedents, as hypothesized by the Stanford Coreference Resolution system on the source side. We embody these correlations in a secondary translation model, which we invoke during decoding with the Moses statistical phrase-based MT system. This solution outperforms a deterministic pronoun post-editing system, as well as a statistical MT baseline, on automatic and human evaluation metrics.
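As a rough illustration of what learning such correlations could look like, the sketch below estimates P(target pronoun | antecedent gender, number) from weighted co-occurrence counts. It is not the authors' implementation: the function name `estimate_pronoun_model`, the sample format, and the per-link confidence weights are all hypothetical, and the integration with Moses is not shown.

```python
from collections import defaultdict


def estimate_pronoun_model(samples, alpha=0.0):
    """Estimate P(pronoun | antecedent features) from weighted counts.

    samples: iterable of (features, pronoun, weight) tuples, where
      features is a (gender, number) pair for the hypothesized antecedent,
      pronoun is the aligned target-side pronoun, and
      weight is an assumed confidence in the coreference link (hypothetical).
    alpha: optional add-alpha smoothing over the observed pronoun vocabulary.
    """
    counts = defaultdict(lambda: defaultdict(float))
    vocab = set()
    for features, pronoun, weight in samples:
        # Uncertain links contribute fractional counts instead of full ones.
        counts[features][pronoun] += weight
        vocab.add(pronoun)

    model = {}
    for features, pron_counts in counts.items():
        total = sum(pron_counts.values()) + alpha * len(vocab)
        model[features] = {
            p: (pron_counts.get(p, 0.0) + alpha) / total for p in sorted(vocab)
        }
    return model


# Toy data, invented for illustration only.
samples = [
    (("masc", "sg"), "il", 1.0),
    (("masc", "sg"), "il", 0.5),
    (("masc", "sg"), "elle", 0.5),
    (("fem", "sg"), "elle", 1.0),
    (("fem", "sg"), "il", 0.25),
]
model = estimate_pronoun_model(samples)
```

Weighting each observation by link confidence is one simple way to keep an unreliable antecedent hypothesis from counting as much as a reliable one. The paper's actual treatment of uncertainty may differ.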
Luong, N. Q., & Popescu-Belis, A. (2016). Improving Pronoun Translation by Modeling Coreference Uncertainty. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 12–20). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w16-2202