Deconstructing multimodality: Visual properties and visual context in human semantic processing

Citations: 4
Mendeley readers: 67
Abstract

Multimodal semantic models that extend linguistic representations with additional perceptual input have proved successful in a range of natural language processing (NLP) tasks. Recent research has used neural methods to automatically create visual representations for words. However, these approaches have extracted visual features from complete images and have not examined how different kinds of visual information impact performance. In contrast, we construct multimodal models that differentiate between the internal visual properties of objects and their external visual context. We evaluate the models on the task of decoding brain activity associated with the meanings of nouns, demonstrating their advantage over models based on complete images.
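To make the setup concrete, below is a minimal sketch (not the authors' implementation) of the two ingredients the abstract describes: fusing a linguistic vector with either object-internal visual features or visual-context features by simple concatenation, and evaluating each semantic space by learning a linear map from brain activity to that space with a leave-one-out pairwise matching test. All array names, dimensions, and the random placeholder data are assumptions for illustration; the paper's actual feature extractors, fusion strategy, and evaluation protocol may differ.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import LeaveOneOut

# Hypothetical inputs: per-noun linguistic vectors and two kinds of visual
# vectors (object-internal properties vs. surrounding visual context).
rng = np.random.default_rng(0)
n_words, d_ling, d_vis, d_brain = 60, 300, 128, 500
ling = rng.normal(size=(n_words, d_ling))        # e.g. distributional word embeddings
vis_object = rng.normal(size=(n_words, d_vis))   # features from the object region
vis_context = rng.normal(size=(n_words, d_vis))  # features from the image background
brain = rng.normal(size=(n_words, d_brain))      # voxel activations per noun (placeholder)

def l2_normalize(x):
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)

def fuse(*parts):
    """Simple multimodal fusion: concatenate L2-normalised modalities."""
    return np.concatenate([l2_normalize(p) for p in parts], axis=1)

def decoding_accuracy(semantic, brain):
    """Leave-one-out decoding: learn a linear map from brain activity to the
    semantic space, then check whether the held-out noun's predicted vector is
    closer to its true vector than to a randomly drawn distractor."""
    correct = 0
    for train, test in LeaveOneOut().split(brain):
        model = Ridge(alpha=1.0).fit(brain[train], semantic[train])
        pred = model.predict(brain[test])[0]
        true_vec = semantic[test][0]
        distractor = semantic[rng.choice(train)]
        if np.dot(pred, true_vec) > np.dot(pred, distractor):
            correct += 1
    return correct / len(brain)

spaces = {
    "linguistic only": l2_normalize(ling),
    "linguistic + object properties": fuse(ling, vis_object),
    "linguistic + visual context": fuse(ling, vis_context),
    "linguistic + both (full-image proxy)": fuse(ling, vis_object, vis_context),
}
for name, space in spaces.items():
    print(f"{name}: {decoding_accuracy(space, brain):.2f}")
```

Concatenation of L2-normalised modalities is a common fusion baseline in the multimodal semantics literature; the comparison across rows mirrors the abstract's contrast between object properties, visual context, and full-image features, with the last row standing in for complete-image features only for illustration.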

Cite (APA)

Davis, C., Bulat, L., Vero, A., & Shutova, E. (2019). Deconstructing multimodality: Visual properties and visual context in human semantic processing. In *SEM@NAACL-HLT 2019 - 8th Joint Conference on Lexical and Computational Semantics (pp. 118–124). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/s19-1013
