Visual attention plays a central role in natural and artificial systems for controlling perceptual resources. Classic artificial visual attention systems use salient image features computed with predefined filters. Recently, deep neural networks have been developed that recognize thousands of objects and autonomously generate visual features optimized by training on large data sets. Beyond object recognition, these features have proven successful in other visual tasks such as object segmentation, tracking and, more recently, visual attention. In this work we propose a biologically inspired object classification and localization framework that combines Deep Convolutional Neural Networks with foveal vision. First, a feed-forward pass is performed to obtain the predicted class labels. Next, object location proposals are obtained by applying a segmentation mask to the saliency map computed through a top-down backward pass. The main contribution of our work lies in the evaluation of the performance obtained with different non-uniform resolutions. We establish a relationship between performance and the amount of information preserved by each sensing configuration. The results demonstrate that we do not need to store and transmit all the information present in high-resolution images: beyond a certain amount of preserved information, performance on the classification and localization tasks saturates.
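The foveal sensing described above can be illustrated as a space-variant blur: full resolution at the fixation point, with acuity falling off towards the periphery. The following minimal numpy sketch blends a sharp image with a blurred copy using a Gaussian weight centered on the fovea; the box blur, the Gaussian falloff, and all parameter values are illustrative assumptions, not the paper's actual multi-level foveation model.

```python
import numpy as np

def box_blur(img, k=5):
    # Separable box blur with edge padding; a stand-in for one
    # coarse level of a Gaussian pyramid (assumption, for illustration).
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    h = np.stack([p[:, i:i + img.shape[1]] for i in range(k)]).mean(axis=0)
    return np.stack([h[i:i + img.shape[0], :] for i in range(k)]).mean(axis=0)

def foveate(img, cx, cy, sigma):
    # Per-pixel blend weight: 1 at the fixation point (cx, cy),
    # decaying with distance, so the periphery tends to the blurred copy.
    ys, xs = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    w = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
    return w * img + (1.0 - w) * box_blur(img)

rng = np.random.default_rng(0)
image = rng.random((32, 32))          # toy single-channel "high-resolution" image
fov = foveate(image, cx=16, cy=16, sigma=4.0)
```

At the fixation point the weight is exactly 1, so the pixel is preserved at full resolution, while far-away pixels are essentially the blurred version; this is the sense in which a non-uniform sensor discards peripheral information while keeping the fovea intact.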
Almeida, A. F., Figueiredo, R., Bernardino, A., & Santos-Victor, J. (2018). Deep Networks for Human Visual Attention: A Hybrid Model Using Foveal Vision. In Advances in Intelligent Systems and Computing (Vol. 694, pp. 117–128). Springer Verlag. https://doi.org/10.1007/978-3-319-70836-2_10