Abstract
Automatic image captioning is a challenging problem in artificial intelligence that spans both computer vision and natural language processing. Inspired by recent advances in machine translation, the encoder-decoder approach is currently the state of the art in English-language captioning. In this study, we propose an image captioning model for the Turkish language. This paper evaluates the encoder-decoder model on the MS COCO dataset by coupling an encoder Convolutional Neural Network (CNN), the component responsible for extracting features from the given images, with a decoder Recurrent Neural Network (RNN), the component responsible for generating captions from those features, to produce Turkish captions. We conducted experiments using the most common evaluation metrics: BLEU, METEOR, ROUGE and CIDEr. Results show that the performance of the proposed model is satisfactory in both qualitative and quantitative evaluations. Finally, this study introduces a free-to-use Web platform (http://mscoco-contributor.herokuapp.com/website/) for improving the dataset via crowdsourcing. The Turkish MS COCO dataset is available for research purposes; once all the images are captioned, a full Turkish dataset will be available for comparative studies.
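Of the metrics listed, BLEU is the most widely reported in captioning work. As a rough illustration only (not the evaluation code used in this paper, which would use multiple references and higher-order n-grams), sentence-level BLEU-1 combines clipped unigram precision with a brevity penalty:

```python
import math
from collections import Counter

def bleu1(candidate: str, reference: str) -> float:
    """Sentence-level BLEU-1 against a single reference.

    Illustrative sketch: clipped unigram precision multiplied by a
    brevity penalty. Real caption evaluation uses multiple references
    and n-grams up to order 4 (BLEU-4).
    """
    cand = candidate.split()
    ref = reference.split()
    ref_counts = Counter(ref)
    # Clip each candidate word's count by its count in the reference,
    # so repeating a correct word does not inflate precision.
    clipped = sum(min(c, ref_counts[w]) for w, c in Counter(cand).items())
    precision = clipped / len(cand)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision
```

For example, a generated caption that matches four of its five words to a six-word reference scores a clipped precision of 0.8, discounted by the brevity penalty exp(1 - 6/5).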
Yildiz, T., Sönmez, E. B., Yilmaz, B. D., & Demir, A. E. (2020). Image captioning in Turkish language: Database and model. Journal of the Faculty of Engineering and Architecture of Gazi University, 35(4), 2089–2100. https://doi.org/10.17341/gazimmfd.597089