Unraveling the Contribution of Image Captioning and Neural Machine Translation for Multimodal Machine Translation

  • Lala C
  • Madhyastha P
  • Wang J
  • et al.
Citations: N/A
Readers: 8 (Mendeley users with this article in their library)

Abstract

Recent work on multimodal machine translation has attempted to address the problem of producing target language image descriptions based on both the source language description and the corresponding image. However, existing work has not been conclusive on the contribution of visual information. This paper presents an in-depth study of the problem by examining the differences and complementarities of two related but distinct approaches to this task: text-only neural machine translation and image captioning. We analyse the scope for improvement and the effect of different data and settings in building models for these tasks. We also propose ways of combining these two approaches for improved translation quality.

Citation (APA)

Lala, C., Madhyastha, P., Wang, J., & Specia, L. (2017). Unraveling the Contribution of Image Captioning and Neural Machine Translation for Multimodal Machine Translation. The Prague Bulletin of Mathematical Linguistics, 108(1), 197–208. https://doi.org/10.1515/pralin-2017-0020
