Meaning maps detect the removal of local semantic scene content but deep saliency models do not

3Citations
Citations of this article
6Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Meaning mapping uses human raters to estimate different semantic features in scenes, and has been a useful tool in demonstrating the important role semantics play in guiding attention. However, recent work has argued that meaning maps do not capture semantic content, but like deep learning models of scene attention, represent only semantically-neutral image features. In the present study, we directly tested this hypothesis using a diffeomorphic image transformation that is designed to remove the meaning of an image region while preserving its image features. Specifically, we tested whether meaning maps and three state-of-the-art deep learning models were sensitive to the loss of semantic content in this critical diffeomorphed scene region. The results were clear: meaning maps generated by human raters showed a large decrease in the diffeomorphed scene regions, while all three deep saliency models showed a moderate increase in the diffeomorphed scene regions. These results demonstrate that meaning maps reflect local semantic content in scenes while deep saliency models do something else. We conclude the meaning mapping approach is an effective tool for estimating semantic content in scenes.

Cite

CITATION STYLE

APA

Hayes, T. R., & Henderson, J. M. (2022). Meaning maps detect the removal of local semantic scene content but deep saliency models do not. Attention, Perception, and Psychophysics, 84(3), 647–654. https://doi.org/10.3758/s13414-021-02395-x

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free