#PraCegoVer: A Large Dataset for Image Captioning in Portuguese


Abstract

Automatically describing images in natural-language sentences is essential to the inclusion of visually impaired people on the Internet. This problem is known as Image Captioning. There are many datasets in the literature, but most contain only English captions, whereas datasets with captions in other languages are scarce. We introduce #PraCegoVer, a multi-modal dataset with Portuguese captions based on posts from Instagram. It is the first large dataset for image captioning in Portuguese. In contrast to popular datasets, #PraCegoVer has only one reference per image, and both the mean and the variance of reference sentence length are significantly high, which makes our dataset linguistically challenging. We carry out a detailed analysis to find the main classes and topics in our data. We compare #PraCegoVer to the MS COCO dataset in terms of sentence length and word frequency. We hope that the #PraCegoVer dataset encourages more work on the automatic generation of descriptions in Portuguese.

Cite

Dos Santos, G. O., Colombini, E. L., & Avila, S. (2022). #PraCegoVer: A Large Dataset for Image Captioning in Portuguese. Data, 7(2). https://doi.org/10.3390/data7020013
