Pragmatically informative image captioning with character-level inference

Reuben Cohn-Gordon; Noah Goodman; Chris Potts

Conference Proceedings (Open Access)

NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference (2018) 2 439-443

DOI: 10.18653/v1/n18-2070

Abstract

We combine a neural image captioner with a Rational Speech Acts (RSA) model to make a system that is pragmatically informative: Its objective is to produce captions that are not merely true but also distinguish their inputs from similar images. Previous attempts to combine RSA with neural image captioning require an inference which normalizes over the entire set of possible utterances. This poses a serious problem of efficiency, previously solved by sampling a small subset of possible utterances. We instead solve this problem by implementing a version of RSA which operates at the level of characters ("a", "b", "c", ...) during the unrolling of the caption. We find that the utterance-level effect of referential captions can be obtained with only character-level decisions. Finally, we introduce an automatic method for testing the performance of pragmatic speaker models, and show that our model outperforms a non-pragmatic baseline as well as a word-level RSA captioner.
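
To make the character-level idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a common RSA formulation in which a literal speaker S0 gives a next-character distribution, a literal listener L0 inverts it over the candidate images with a uniform prior, and a pragmatic speaker S1 is proportional to S0 times L0 raised to a rationality parameter alpha. The abstract does not state these formulas, so treat them as an assumption. A toy lookup table (`CAPTIONS`) stands in for the neural captioner, and every name below (`STOP`, `literal_speaker`, `pragmatic_speaker`, `greedy_caption`) is hypothetical.

```python
STOP = "$"  # hypothetical end-of-caption marker

# Toy stand-in for a neural captioner: caption probabilities per image (assumed data).
CAPTIONS = {
    "target": {"a bus": 0.6, "a red bus": 0.4},
    "distractor": {"a bus": 0.6, "a blue bus": 0.4},
}


def literal_speaker(image, prefix):
    """S0: next-character distribution given an image and a partial caption."""
    mass = {}
    for caption, p in CAPTIONS[image].items():
        full = caption + STOP
        if full.startswith(prefix) and len(full) > len(prefix):
            ch = full[len(prefix)]
            mass[ch] = mass.get(ch, 0.0) + p
    total = sum(mass.values())
    return {ch: m / total for ch, m in mass.items()} if total else {}


def pragmatic_speaker(target, images, prefix, alpha=1.0):
    """S1: reweight S0 by how well each next character identifies the target."""
    s0 = {img: literal_speaker(img, prefix) for img in images}
    scores = {}
    for ch, p in s0[target].items():
        # L0: listener posterior over images under a uniform prior.
        # The sum runs over the few candidate images, never over whole captions.
        denom = sum(s0[img].get(ch, 0.0) for img in images)
        l0 = p / denom
        scores[ch] = p * l0 ** alpha
    total = sum(scores.values())
    return {ch: s / total for ch, s in scores.items()}


def greedy_caption(step, max_len=40):
    """Unroll a caption one character at a time, taking the most likely character."""
    prefix = ""
    while len(prefix) < max_len:
        dist = step(prefix)
        if not dist:
            break
        ch = max(dist, key=dist.get)
        if ch == STOP:
            break
        prefix += ch
    return prefix


images = ["target", "distractor"]
literal = greedy_caption(lambda p: literal_speaker("target", p))
pragmatic = greedy_caption(lambda p: pragmatic_speaker("target", images, p))
print(literal)    # a bus
print(pragmatic)  # a red bus
```

In this toy setting the literal speaker produces "a bus", which is true of both images, while the pragmatic speaker switches to "r" after "a " and produces "a red bus", which only fits the target. Each step normalizes over a handful of characters and images, which is the efficiency argument the abstract makes against normalizing over all possible utterances.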

Citation (APA)

Cohn-Gordon, R., Goodman, N., & Potts, C. (2018). Pragmatically informative image captioning with character-level inference. In NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference (Vol. 2, pp. 439–443). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/n18-2070
