In this paper we explore several neural network architectures for the WMT 2017 multimodal translation sub-task on multilingual image caption generation. The task is to generate German image captions from a training corpus of images paired with captions in both English and German. We investigate models that generate captions in both languages jointly, discarding the English output at evaluation time. Compared with a baseline trained on the German captions alone, these models show significant improvement.
Jaffe, A. (2017). Generating image descriptions using multilingual data. In WMT 2017 - 2nd Conference on Machine Translation, Proceedings (pp. 458–464). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w17-4750