Recently, algorithms for object recognition and related tasks have become sufficiently proficient that new vision tasks can now be pursued. In this paper, we build a system capable of answering open-ended text-based questions about images, which is known as Visual Question Answering (VQA). Our approach's key insight is that we can predict the form of the answer from the question. We formulate our solution in a Bayesian framework. When our approach is combined with a discriminative model, the combined model achieves state-of-the-art results on four benchmark datasets for open-ended VQA: DAQUAR, COCO-QA, The VQA Dataset, and Visual7W.
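The idea of predicting the answer's form from the question can be illustrated with a small sketch. The abstract does not spell out the factorization, so the code below assumes one plausible reading: a distribution over answer types given the question is combined with per-type answer distributions by marginalizing over the type. The function name, the type labels, and all probabilities are hypothetical and chosen only for illustration; this is not the paper's model.

```python
def answer_posterior(type_given_question, answer_given_type):
    """Marginalize over answer types (assumed factorization):
    P(answer | image, question) = sum_t P(answer | image, question, t) * P(t | question)
    """
    posterior = {}
    for answer_type, p_type in type_given_question.items():
        for answer, p_answer in answer_given_type[answer_type].items():
            posterior[answer] = posterior.get(answer, 0.0) + p_type * p_answer
    return posterior


# Hypothetical numbers for a question such as "How many dogs are there?"
# P(type | question): the question strongly suggests a numeric answer.
type_given_question = {"number": 0.9, "color": 0.1}

# P(answer | image, question, type): made-up per-type answer distributions.
answer_given_type = {
    "number": {"two": 0.7, "three": 0.3},
    "color": {"brown": 0.6, "two": 0.4},
}

posterior = answer_posterior(type_given_question, answer_given_type)
best_answer = max(posterior, key=posterior.get)
print(best_answer)
```

In this toy setting, the type prior concentrates probability mass on numeric answers, so answers of an implausible form (here, a color) are down-weighted before the final answer is chosen.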