Reducing unknown unknowns with guidance in image caption

Abstract

Deep recurrent models applied to image captioning, which bridges computer vision and natural language processing, have achieved excellent results, automatically generating natural sentences that describe an image. However, the mismatch in sample distribution between the training data and the open world may lead to large numbers of hidden Unknown Unknowns (UUs): instances on which the model is confidently wrong. Such errors can greatly harm the correctness of generated captions. In this paper, we present a framework for UU reduction and model optimization based on recurrently training with small amounts of external data detected with the assistance of crowd commonsense. We demonstrate and analyze our method on a current state-of-the-art image-to-text model. Aiming to reduce the number of UUs in generated captions, we obtain over 12% UU reduction and reinforce the model's cognition of these scenes.
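As a concrete illustration, the loop below sketches the kind of crowd-assisted, iterative fine-tuning procedure the abstract describes. It is a minimal sketch, not the authors' implementation: the callables generate, crowd_flag, and fine_tune, the Image placeholder type, and the default of three rounds are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Tuple

Image = object  # hypothetical placeholder type for an input image


def reduce_unknown_unknowns(
    generate: Callable[[Image], str],                              # model's caption generator
    crowd_flag: Callable[[List[Tuple[Image, str]]], List[Tuple[Image, str]]],
    fine_tune: Callable[[List[Tuple[Image, str]]], None],
    open_world_images: Iterable[Image],
    rounds: int = 3,                                               # assumed round count
) -> None:
    """Recurrently fine-tune a captioning model on small external batches
    whose outputs crowd workers flagged as Unknown Unknowns (UUs).
    All callables are hypothetical stand-ins, not the paper's API."""
    images = list(open_world_images)
    for _ in range(rounds):
        # 1. Caption open-world images outside the training distribution.
        captions = [(img, generate(img)) for img in images]

        # 2. Crowd commonsense check: workers flag confidently wrong
        #    captions (the hidden UUs) and supply corrected sentences.
        flagged = crowd_flag(captions)
        if not flagged:
            break  # no UUs detected this round; stop early

        # 3. Fine-tune on the small corrected external set, reinforcing
        #    the model's cognition of the failure scenes.
        fine_tune(flagged)
```

A caller would plug in its own model wrapper and crowd-annotation pipeline for the three callables; the loop itself only expresses the detect-then-retrain cycle.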

Citation (APA)

Ni, M., Yang, J., Lin, X., & He, L. (2017). Reducing unknown unknowns with guidance in image caption. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10614 LNCS, pp. 547–555). Springer Verlag. https://doi.org/10.1007/978-3-319-68612-7_62
