Semantic reanalysis of scene words in visual question answering

Abstract

Visual Question Answering (VQA) is a joint vision-and-language task that aims to answer questions about given images. Correctly handling questions that aggregate information across multiple photo albums remains a key challenge in VQA: when a question spans several albums, the model must understand both the album images and the corresponding question. Under the influence of multiple albums, a scene word in the question can lead the model to infer the wrong scene and output the wrong answer, degrading VQA performance. To address this problem, this paper proposes a new image-sentence similarity matching model that produces a correct image representation by learning semantic concepts. Because a scene word is not an entity, the information the model extracts for it may be incorrect; we therefore reanalyse the question from a different perspective and derive the answer from the similarity between the question and the visual-text. We evaluated our model on the MemexQA dataset. The experimental results show that our model not only produces meaningful text sentences that justify the correctness of its answers, but also improves accuracy by nearly 10%.
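The abstract does not spell out the matching step, but the core idea it describes is scoring a question against candidate visual-text representations and selecting the best match. Below is a minimal sketch of that step in Python, assuming the question and the visual-texts have already been encoded into fixed-size embedding vectors by some learned encoders; the function names and the cosine-similarity choice are illustrative assumptions, not the paper's actual implementation.

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two embedding vectors;
    # the small epsilon guards against zero-norm vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def match_question_to_visual_texts(question_emb, visual_text_embs):
    # Hypothetical helper: return the index of the visual-text whose
    # embedding is most similar to the question embedding, plus all scores.
    scores = [cosine_similarity(question_emb, v) for v in visual_text_embs]
    return int(np.argmax(scores)), scores

# Toy usage with random stand-in embeddings; a real system would use
# learned question and visual-text encoders instead.
rng = np.random.default_rng(0)
question = rng.normal(size=256)
candidates = [rng.normal(size=256) for _ in range(4)]
best, scores = match_question_to_visual_texts(question, candidates)
print(best, [round(s, 3) for s in scores])

The answer associated with the highest-scoring visual-text would then be returned, which is consistent with the abstract's claim that the matched text sentence itself serves as evidence for the answer.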

Citation (APA)

Jiang, S., Ma, M., Wang, J., Liang, J., Liu, K., Sun, Y., … Jin, G. (2019). Semantic reanalysis of scene words in visual question answering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11857 LNCS, pp. 468–479). Springer. https://doi.org/10.1007/978-3-030-31654-9_40
