Nonparametric method for data-driven image captioning


Abstract

We present a nonparametric density estimation technique for image caption generation. Data-driven matching methods have been shown to be effective for a variety of complex problems in Computer Vision. These methods reduce an inference problem for an unknown image to finding an existing labeled image that is semantically similar. However, related approaches for image caption generation (Ordonez et al., 2011; Kuznetsova et al., 2012) are hampered by noisy estimation of visual content and poor alignment between images and human-written captions. Our work addresses this challenge by estimating a word frequency representation of the visual content of a query image. This allows us to cast caption generation as an extractive summarization problem. Our model strongly outperforms two state-of-the-art caption extraction systems according to human judgments of caption relevance. © 2014 Association for Computational Linguistics.
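The approach described above — estimating a word frequency representation from the captions of visually similar images, then selecting a caption extractively — can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tokenization, the add-one smoothing, and the mean log-probability scoring rule are all illustrative assumptions standing in for the paper's actual density estimator.

```python
# Illustrative sketch only: a simple unigram stand-in for the paper's
# nonparametric word-frequency estimator and extractive caption selector.
from collections import Counter
import math

def word_distribution(neighbor_captions, smoothing=1.0):
    """Estimate a smoothed unigram distribution over words from the
    captions of the query image's visually similar neighbor images.
    Returns (distribution, probability mass for unseen words)."""
    counts = Counter()
    for caption in neighbor_captions:
        counts.update(caption.lower().split())
    total = sum(counts.values()) + smoothing * len(counts)
    dist = {w: (c + smoothing) / total for w, c in counts.items()}
    return dist, smoothing / total

def score_caption(caption, dist, unseen_prob):
    """Score a candidate caption by its mean log-probability under the
    estimated word distribution (extractive-summarization style)."""
    words = caption.lower().split()
    return sum(math.log(dist.get(w, unseen_prob)) for w in words) / len(words)

def extract_caption(candidates, neighbor_captions):
    """Return the existing caption that best matches the word statistics
    of the query image's visual neighborhood."""
    dist, unseen_prob = word_distribution(neighbor_captions)
    return max(candidates, key=lambda c: score_caption(c, dist, unseen_prob))
```

For example, if the retrieved neighbors are beach/dog scenes, a candidate like "a dog on the beach" scores higher than an unrelated caption such as "a red car in traffic", since its words carry more mass in the neighborhood's estimated distribution.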

Citation (APA)

Mason, R., & Charniak, E. (2014). Nonparametric method for data-driven image captioning. In 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014 - Proceedings of the Conference (Vol. 2, pp. 592–598). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/p14-2097
