Recognizing jumbled images: The role of local and global information in image classification

  • Parikh D

Abstract

The performance of current state-of-the-art computer vision algorithms at image classification falls significantly short of human abilities. To reduce this gap, it is important for the community to know what problems to solve, and not just how to solve them. Towards this goal, we use jumbled images to separate two widely investigated aspects of images, local and global information, and identify the performance bottleneck. Interestingly, humans have been shown to reliably recognize jumbled images. The goal of our paper is to determine a functional model that mimics how humans recognize jumbled images, i.e., by exploiting local information alone, and to evaluate whether existing implementations of this computational model suffice to match human performance. Surprisingly, in our series of human studies and machine experiments, we find that a simple bag-of-words based majority-vote-like strategy is an accurate functional model of how humans recognize jumbled images. Moreover, a straightforward machine implementation of this model achieves accuracies similar to those of human subjects at classifying jumbled images. This indicates that existing machine vision techniques may already leverage local information from images effectively, and that future research efforts should focus on more advanced modeling of global information.
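
As a rough illustration of the majority-vote idea described in the abstract, the sketch below jumbles an image by shuffling its blocks, labels each block independently, and returns the most common label. It is not the paper's implementation: the function names, the grid-based jumbling, and the mean-intensity "prototype" classifier are all hypothetical stand-ins for real local features and a trained local classifier.

```python
from collections import Counter

import numpy as np


def split_blocks(image, grid):
    """Cut a 2D image into grid x grid equally sized blocks (row-major order)."""
    h, w = image.shape
    bh, bw = h // grid, w // grid
    return [image[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            for r in range(grid) for c in range(grid)]


def jumble(image, grid, rng):
    """Shuffle block positions: local content is kept, global layout is destroyed."""
    blocks = split_blocks(image, grid)
    order = rng.permutation(len(blocks))
    rows = [np.hstack([blocks[i] for i in order[r * grid:(r + 1) * grid]])
            for r in range(grid)]
    return np.vstack(rows)


def classify_block(block, prototypes):
    """Hypothetical local classifier: pick the class whose prototype
    mean intensity is closest to the block's mean intensity."""
    return min(prototypes, key=lambda label: abs(block.mean() - prototypes[label]))


def majority_vote(image, grid, prototypes):
    """Label every block independently and return the most frequent label."""
    votes = Counter(classify_block(b, prototypes) for b in split_blocks(image, grid))
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = np.full((8, 8), 0.2)   # mostly "dark" blocks
    image[:2, :4] = 0.8            # two "bright" blocks
    prototypes = {"dark": 0.2, "bright": 0.8}  # assumed class prototypes
    print(majority_vote(image, 4, prototypes))
    print(majority_vote(jumble(image, 4, rng), 4, prototypes))
```

Because the vote depends only on which blocks are present and not on where they sit, the prediction is the same for an image and for its jumbled version, provided the same grid is used for jumbling and voting.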

Authors

  • Devi Parikh
