The Role of Unlabeled Data in Supervised Learning

Tom M. Mitchell

Book Chapter


Springer Netherlands (2004), pp. 103–111

DOI: 10.1007/978-1-4020-2783-3_7


Abstract

In this paper we consider the potential role of unlabeled data in supervised learning. We present an algorithm and experimental results demonstrating that unlabeled data can significantly improve learning accuracy in certain practical problems. We then identify the abstract problem structure that enables the algorithm to successfully utilize this unlabeled data, and prove that unlabeled data will boost learning accuracy for problems in this class. The problem class we identify includes problems where the features describing the examples are redundantly sufficient for classifying the example, a notion we make precise in the paper. This problem class includes many natural learning problems faced by humans, such as learning a semantic lexicon over noun phrases in natural language, and learning to recognize objects from multiple sensor inputs. We argue that models of human and animal learning should consider more strongly the potential role of unlabeled data, and that many natural learning problems fit the class we identify.
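The abstract does not name the algorithm or show any data, so the following is only an illustrative sketch of how redundantly sufficient features might let unlabeled examples help. It assumes a co-training-style loop (in the sense of Blum and Mitchell, 1998), in which each example has two views and either view alone could determine the label. The toy noun-phrase/context pairs, the function names (`train_view`, `predict_view`, `co_train`), and the "label only when a feature value has been seen with a single label" rule are all hypothetical choices made for this example, not details taken from the chapter.

```python
from collections import Counter, defaultdict

def train_view(labeled, view):
    """Count the labels seen with each feature value in one view (0 or 1)."""
    counts = defaultdict(Counter)
    for features, label in labeled:
        counts[features[view]][label] += 1
    return counts

def predict_view(counts, value):
    """Return a label only if this value was always seen with a single label."""
    seen = counts.get(value)
    if not seen or len(seen) != 1:
        return None
    return next(iter(seen))

def co_train(labeled, unlabeled, max_rounds=10):
    """Grow the labeled set by letting each view label examples for the other."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        h1 = train_view(labeled, 0)  # classifier over noun phrases (assumed view 1)
        h2 = train_view(labeled, 1)  # classifier over contexts (assumed view 2)
        newly_labeled, remaining = [], []
        for features in pool:
            votes = {predict_view(h1, features[0]),
                     predict_view(h2, features[1])} - {None}
            if len(votes) == 1:
                # Exactly one label proposed: at least one view recognizes the
                # example and the views do not contradict each other.
                newly_labeled.append((features, votes.pop()))
            else:
                remaining.append(features)
        if not newly_labeled:
            break
        labeled.extend(newly_labeled)
        pool = remaining
    return train_view(labeled, 0), train_view(labeled, 1), labeled

# Hypothetical toy data: (noun phrase, context) pairs.
seeds = [
    (("new york", "flew to"), "location"),
    (("mr. smith", "said"), "person"),
]
unlabeled = [
    ("pittsburgh", "flew to"),
    ("pittsburgh", "mayor of"),
    ("boston", "mayor of"),
    ("dr. jones", "said"),
    ("dr. jones", "met with"),
    ("ms. lee", "met with"),
]

h1, h2, labeled = co_train(seeds, unlabeled)
print(predict_view(train_view(seeds, 0), "boston"))  # None: seeds alone never saw "boston"
print(predict_view(h1, "boston"))                    # location
```

In this toy run, the label for "boston" is reached only through a chain of unlabeled pairs ("flew to" labels "pittsburgh", "pittsburgh" labels "mayor of", "mayor of" labels "boston"), which is the kind of benefit the abstract attributes to redundant views. A realistic version would use probabilistic classifiers and confidence thresholds rather than exact lookups.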

Citation (APA)

Mitchell, T. M. (2004). The Role of Unlabeled Data in Supervised Learning. In Language, Knowledge, and Representation (pp. 103–111). Springer Netherlands. https://doi.org/10.1007/978-1-4020-2783-3_7
