VTKEL: A resource for visual-textual-knowledge entity linking


Abstract

To understand the content of a document containing both text and pictures, an artificial agent needs to jointly recognize the entities shown in the pictures and mentioned in the text, and to link them to its background knowledge. This complex task, which we call Visual-Textual-Knowledge Entity Linking (VTKEL), aims at linking visual and textual entity mentions to the corresponding entity (or a newly created one) in the agent's knowledge base. Solving the VTKEL task opens a wide range of opportunities for improving semantic visual interpretation. For instance, given the effectiveness and robustness of state-of-the-art NLP technologies for entity linking, automatically linking visual and textual mentions of the same entities to the ontology would yield a huge amount of automatically annotated images with detailed categories. In this paper, we propose the VTKEL dataset, consisting of images and corresponding captions, in which both the visual and the textual mentions are annotated with the corresponding entities, typed according to the YAGO ontology. The VTKEL dataset can be used for training and evaluating algorithms for visual-textual-knowledge entity linking.
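The linking task described in the abstract can be illustrated with a toy data structure: visual mentions (image regions) and textual mentions (caption spans) are attached to a shared entity, which is either resolved against the knowledge base or newly created, and typed with a YAGO class. This is a minimal sketch; all names, the schema, and the identifiers (`vtkel:dog_1`, `yago:Dog`, bounding-box format) are illustrative assumptions, not the actual VTKEL annotation format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Entity:
    entity_id: str        # KB resource URI, or a newly minted ID (illustrative)
    yago_type: str        # type from the YAGO ontology (illustrative)
    # (caption_id, start_char, end_char) for textual mentions
    textual_mentions: List[Tuple[str, int, int]] = field(default_factory=list)
    # (image_id, (x, y, w, h)) for visual mentions
    visual_mentions: List[Tuple[str, Tuple[int, int, int, int]]] = field(default_factory=list)

def link_mention(kb: Dict[str, Entity], entity_id: str, yago_type: str,
                 mention, visual: bool = False) -> Entity:
    """Link a mention to an entity, creating the entity if it is not in the KB yet."""
    ent = kb.setdefault(entity_id, Entity(entity_id, yago_type))
    (ent.visual_mentions if visual else ent.textual_mentions).append(mention)
    return ent

kb: Dict[str, Entity] = {}
# Caption "A dog catches a frisbee": the span "dog" (chars 2-5) mentions the entity.
link_mention(kb, "vtkel:dog_1", "yago:Dog", ("cap_0", 2, 5))
# A region of the paired image shows the same dog, so it links to the same entity.
link_mention(kb, "vtkel:dog_1", "yago:Dog", ("img_0", (34, 20, 80, 60)), visual=True)
```

Both mentions resolve to the single entity `vtkel:dog_1`, mirroring how the dataset grounds image regions and caption spans in one typed entity.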

Citation (APA)

Dost, S., Serafini, L., Rospocher, M., Ballan, L., & Sperduti, A. (2020). VTKEL: A resource for visual-textual-knowledge entity linking. In Proceedings of the ACM Symposium on Applied Computing (pp. 2021–2028). Association for Computing Machinery. https://doi.org/10.1145/3341105.3373958
