Question Answering for Visual Navigation in Human-Centered Environments

Abstract

In this paper, we propose the HISNav VQA dataset, a challenging dataset for the Visual Question Answering task aimed at the needs of Visual Navigation in human-centered environments. The dataset consists of images of various room scenes captured in the Habitat virtual environment and of questions that are important for navigation tasks relying only on visual information. We also propose a baseline for the HISNav VQA dataset, a Vector Semiotic Architecture, and demonstrate its performance. The Vector Semiotic Architecture combines a Sign-Based World Model with Vector Symbolic Architectures: the Sign-Based World Model represents various aspects of an agent's knowledge, while Vector Symbolic Architectures operate at the low computational level. The Vector Semiotic Architecture addresses the symbol grounding problem, which plays an important role in the Visual Question Answering task.
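To make the role of Vector Symbolic Architectures in such a baseline more concrete, the sketch below illustrates the standard binding and bundling operations on random bipolar hypervectors (hyperdimensional computing in the style of Kanerva). It is only an illustrative assumption about how a scene fact such as "the chair is in the kitchen" could be encoded and queried; it is not the authors' implementation, and the role/filler names and dimensionality are hypothetical.

```python
import numpy as np

# Minimal Vector Symbolic Architecture (VSA) sketch with bipolar hypervectors.
# Illustrative only: shows the binding/bundling operations VSAs provide at the
# low computational level; not the paper's actual implementation.

DIM = 10_000
rng = np.random.default_rng(0)

def random_hv():
    """Random bipolar hypervector in {-1, +1}^DIM."""
    return rng.choice([-1, 1], size=DIM)

def bind(a, b):
    """Binding (element-wise multiplication): associates a role with a filler."""
    return a * b

def bundle(*vectors):
    """Bundling (element-wise majority sign): superimposes several bound pairs."""
    return np.sign(np.sum(vectors, axis=0))

def similarity(a, b):
    """Normalized dot product: near 1 for identical, near 0 for unrelated vectors."""
    return float(a @ b) / DIM

# Hypothetical scene fact: "the chair is located in the kitchen".
OBJECT, LOCATION = random_hv(), random_hv()   # role vectors
chair, kitchen = random_hv(), random_hv()     # filler vectors

scene = bundle(bind(OBJECT, chair), bind(LOCATION, kitchen))

# Querying: unbinding with the LOCATION role recovers a noisy copy of "kitchen".
noisy_location = bind(scene, LOCATION)
print(similarity(noisy_location, kitchen))    # high (about 0.5 after bundling two pairs)
print(similarity(noisy_location, chair))      # near 0
```

The noisy result of unbinding is cleaned up by comparing it against an item memory of known hypervectors, so symbols are resolved by similarity rather than by exact lookup; this is, roughly, how such architectures can support grounding perceptual content in symbolic structures.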

Citation (APA)

Kirilenko, D. E., Kovalev, A. K., Osipov, E., & Panov, A. I. (2021). Question Answering for Visual Navigation in Human-Centered Environments. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13068 LNAI, pp. 31–45). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-89820-5_3
