While a multitude of approaches for extracting semantic information from multimedia documents have emerged in recent years, isolating any form of holistic semantic representation from a larger document, such as a movie, is not yet feasible. In this paper we present the approaches we used in the first instance of the Deep Video Understanding Challenge. We combine several multi-modal detectors with an integration scheme informed by methods from the semantic web context in order to determine the capabilities and limitations of currently available methods for extracting semantic relations between the characters and locations relevant to the narrative of a movie.
Baumgartner, M., Rossetto, L., & Bernstein, A. (2020). Towards Using Semantic-Web Technologies for Multi-Modal Knowledge Graph Construction. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 4645–4649). Association for Computing Machinery, Inc. https://doi.org/10.1145/3394171.3416292