We first show that pooling multiple human judgments of relevance yields a predictor of relevance superior to that obtained from a single human's relevance judgments. A learning algorithm trained on relevance judgments from a single human would be expected to perform on new material at a level somewhat below that human. However, we examine two learning methods that, when trained on the superior source of pooled human relevance judgments, are able to perform at the level of a single human on new material. All performance comparisons are made against an independent human judge. Both algorithms function by producing term weights: one by a log odds calculation and the other by a least-squares fit to human relevance ratings. Some characteristics of the algorithms are examined. © 1998 ACM.
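To make the log odds idea concrete, the following is a minimal sketch of how term weights might be derived from pooled judgments. It is not the paper's formulation: the function names, the averaging of binary votes, the 0.5 relevance threshold, the additive smoothing constant, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def pool_judgments(judgments):
    """Average several judges' binary votes per document (assumed pooling rule)."""
    return [sum(votes) / len(votes) for votes in zip(*judgments)]


def log_odds_weights(docs, pooled, threshold=0.5, smoothing=0.5):
    """Weight each term by the difference in log odds of occurring in
    relevant versus non-relevant documents (threshold and smoothing are assumptions)."""
    relevant = [set(d) for d, p in zip(docs, pooled) if p >= threshold]
    nonrelevant = [set(d) for d, p in zip(docs, pooled) if p < threshold]
    rel_df = Counter(t for d in relevant for t in d)
    non_df = Counter(t for d in nonrelevant for t in d)
    weights = {}
    for term in set(rel_df) | set(non_df):
        # Smoothed document-frequency estimates keep both odds finite.
        p = (rel_df[term] + smoothing) / (len(relevant) + 2 * smoothing)
        q = (non_df[term] + smoothing) / (len(nonrelevant) + 2 * smoothing)
        weights[term] = math.log(p / (1 - p)) - math.log(q / (1 - q))
    return weights


def score(doc, weights):
    """Score a new document by summing the weights of its distinct terms."""
    return sum(weights.get(t, 0.0) for t in set(doc))


# Toy data (hypothetical): four tokenized documents, three judges.
docs = [
    ["protein", "binding"],
    ["protein", "folding"],
    ["stock", "market"],
    ["market", "binding"],
]
judgments = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
]
pooled = pool_judgments(judgments)
weights = log_odds_weights(docs, pooled)
```

In this sketch a term seen mostly in documents the pool judged relevant receives a positive weight, a term seen mostly in the others receives a negative weight, and a term spread evenly across both groups receives a weight near zero. The least-squares alternative mentioned in the abstract would instead fit the weights directly to the pooled ratings and is not shown here.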
Wilbur, W. J. (1998). The knowledge in multiple human relevance judgments. ACM Transactions on Information Systems, 16(2), 101–126. https://doi.org/10.1145/279339.279340