Self-attention based visual dialogue

Abstract

We improve performance on the task of Visual Dialogue. We integrate a self-attention mechanism to improve the results reported in the original Visual Dialogue paper. Visual Dialogue differs from other downstream tasks and serves as a general test of machine intelligence: the model has to be proficient enough in both vision and language that individual answers can be assessed and progress can be measured. The dataset used in this paper is VisDial v0.9, collected by researchers at Georgia Tech. We used the same train/test splits as the original paper to evaluate the results. It contains approximately 1.2 million question-answer pairs, with ten question-answer pairs for each of ~120,000 images from COCO. To keep the comparison fair and simple, we used the same encoder-decoder architecture, namely the Late Fusion encoder and the discriminative decoder. We added the self-attention module from the SAGAN paper to the encoder. The inclusion of the self-attention module was motivated by the observation that many answers from the Visual Dialog model were based solely on the questions asked and not on the image. The hypothesis is therefore that the self-attention module will make the model attend to the image while generating an answer.
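As a concrete illustration of the module the abstract refers to, the following is a minimal sketch of SAGAN-style self-attention over image feature maps, written in PyTorch. The class name, the channel reduction factor, and the exact point of integration into the Late Fusion encoder are assumptions for illustration and may differ from the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SelfAttention2d(nn.Module):
    """SAGAN-style self-attention over spatial feature maps."""

    def __init__(self, in_channels: int, reduction: int = 8):
        super().__init__()
        # 1x1 convolutions project features into query/key/value spaces
        self.query = nn.Conv2d(in_channels, in_channels // reduction, kernel_size=1)
        self.key = nn.Conv2d(in_channels, in_channels // reduction, kernel_size=1)
        self.value = nn.Conv2d(in_channels, in_channels, kernel_size=1)
        # gamma starts at 0, so the module initially passes features through unchanged
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (B, HW, C/r)
        k = self.key(x).flatten(2)                      # (B, C/r, HW)
        v = self.value(x).flatten(2)                    # (B, C, HW)
        attn = F.softmax(torch.bmm(q, k), dim=-1)       # (B, HW, HW) attention map
        out = torch.bmm(v, attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                     # residual connection


# Hypothetical usage: applied to the image feature map before it is fused
# with the question and dialogue-history encodings in the encoder.
features = torch.randn(2, 512, 7, 7)
attended = SelfAttention2d(512)(features)
```

In this sketch, each spatial location attends to every other location in the image feature map, which is the mechanism hypothesized above to keep the answer grounded in the image rather than in the question alone.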

APA

Mathur, V., Jha, D., & Kumar, S. (2019). Self-attention based visual dialogue. International Journal of Recent Technology and Engineering, 8(3), 8792–8795. https://doi.org/10.35940/ijrte.C5306.098319
