What’s the point: Semantic segmentation with point supervision

  • Bearman A
  • Russakovsky O
  • Ferrari V
 et al. 

  • 163 Mendeley users have this article in their library.
  • 14 citations of this article.


The semantic image segmentation task presents a trade-off between test-time accuracy and training-time annotation cost. Detailed per-pixel annotations enable training accurate models but are very time-consuming to obtain; image-level class labels are an order of magnitude cheaper but result in less accurate models. We take a natural step from image-level annotation towards stronger supervision: we ask annotators to point to an object if one exists. We incorporate this point supervision, along with a novel objectness potential, into the training loss function of a CNN model. Experimental results on the PASCAL VOC 2012 benchmark reveal that the combined effect of point-level supervision and the objectness potential yields an improvement of 12.9% mIOU over image-level supervision. Further, we demonstrate that, given a fixed annotation budget, models trained with point-level supervision are more accurate than models trained with image-level, squiggle-level or full supervision.
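
To make the idea of a training loss with point supervision and an objectness term more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the loss as defined in the paper: the abstract does not give the formula, so the function name `point_objectness_loss`, the equal weighting of the two terms, the convention that class 0 is background, and the exact form of the objectness term are all hypothetical. The point term is an ordinary cross-entropy evaluated only at the annotated pixels; the objectness term rewards putting probability mass on any object class where a class-agnostic prior says "object" and on background elsewhere.

```python
import numpy as np


def softmax(scores):
    """Per-pixel softmax over the class axis (the last axis)."""
    shifted = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def point_objectness_loss(scores, points, objectness,
                          background_class=0, eps=1e-12):
    """Illustrative loss combining point supervision with an objectness prior.

    scores:      (H, W, C) array of raw per-pixel class scores from a model.
    points:      list of (row, col, class_id) annotator clicks.
    objectness:  (H, W) array in [0, 1], a class-agnostic prior that a pixel
                 belongs to some object (assumed to come from an external model).
    All names and the exact functional form are assumptions for illustration.
    """
    probs = softmax(scores)

    # Point term: cross-entropy at the annotated pixels only.
    point_term = 0.0
    for row, col, class_id in points:
        point_term -= np.log(probs[row, col, class_id] + eps)
    if points:
        point_term /= len(points)

    # Objectness term: agreement between the prior and the predicted
    # object-vs-background split at every pixel.
    p_background = probs[..., background_class]
    p_object = 1.0 - p_background
    agreement = objectness * p_object + (1.0 - objectness) * p_background
    objectness_term = -np.mean(np.log(agreement + eps))

    return point_term + objectness_term


# Tiny usage example: a 2x2 image, 3 classes, one click on class 1.
scores = np.zeros((2, 2, 3))
points = [(0, 0, 1)]
objectness = np.full((2, 2), 0.5)
loss = point_objectness_loss(scores, points, objectness)
```

In this sketch, unannotated pixels contribute only through the objectness term, which is one way a model can receive a training signal about object extent from a single click per object. A practical implementation would also need an image-level term and per-point weighting, which are omitted here.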

Author-supplied keywords

  • Data annotation
  • Semantic segmentation
  • Weak supervision

Authors

  • Amy Bearman
  • Olga Russakovsky
  • Vittorio Ferrari
  • Li Fei-Fei
