Visual dialog with multi-turn attentional memory network

Abstract

Visual dialog is the task of answering a question about an input image given a historical dialog about that image, and it often requires retrieving both visual and textual facts relevant to the question. The problem differs from visual question answering (VQA), which relies only on visual grounding estimated from an image–question pair, whereas visual dialog requires interactions among a question, an input image, and a historical dialog. Most methods rely on a one-turn attention network to obtain facts relevant to a question. However, the information-transition phenomenon present in these facts prevents such methods from retrieving all relevant information. In this paper, we propose a multi-turn attentional memory network for visual dialog. First, we propose an attentional memory network that maintains image regions and the historical dialog in two memory banks and attends the question to be answered to both the visual and textual banks to obtain multi-modal facts. Further, considering the information-transition phenomenon, we design a multi-turn attention architecture that attends to the memory banks over multiple turns to retrieve more facts and thus produce a better answer. We evaluate the proposed model on the VisDial v0.9 dataset, and the experimental results demonstrate its effectiveness.
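The abstract's core idea can be sketched as follows: attend a question vector to two memory banks (image-region features and dialog-history features) and repeat the retrieval over several turns, folding retrieved facts back into the query. This is a minimal illustration, not the paper's implementation; the additive query update and dot-product scoring are simplifying assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, memory):
    """One attention turn: score each memory slot against the query,
    then return the attention-weighted sum of slots (a retrieved fact)."""
    scores = memory @ query          # (num_slots,)
    weights = softmax(scores)        # attention distribution over slots
    return weights @ memory          # (dim,)

def multi_turn_attention(question, visual_bank, textual_bank, turns=2):
    """Attend the question to both banks for several turns, refining the
    query with the retrieved multi-modal facts after each turn."""
    query = question
    for _ in range(turns):
        visual_fact = attend(query, visual_bank)
        textual_fact = attend(query, textual_bank)
        # Simple additive fusion (assumed here; the paper's fusion differs).
        query = query + visual_fact + textual_fact
    return query

rng = np.random.default_rng(0)
dim = 8
question = rng.normal(size=dim)
visual_bank = rng.normal(size=(5, dim))   # e.g. 5 image-region features
textual_bank = rng.normal(size=(3, dim))  # e.g. 3 dialog-history utterances
fused = multi_turn_attention(question, visual_bank, textual_bank)
print(fused.shape)  # (8,)
```

Running more turns lets later retrievals condition on facts found earlier, which is the intuition behind the multi-turn design described in the abstract.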

APA

Kong, D., & Wu, F. (2018). Visual dialog with multi-turn attentional memory network. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11164 LNCS, pp. 611–621). Springer Verlag. https://doi.org/10.1007/978-3-030-00776-8_56
