Learning to predict population-level label distributions


Abstract

Machine learning problems are often subjective or ambiguous. That is, humans solving the same problems might come to legitimate but completely different conclusions, based on their personal experiences and beliefs. In supervised learning, particularly when using crowdsourced training data, multiple annotations per data item are usually reduced to a single label representing ground truth. This hides a rich source of diversity and subjectivity of opinions about the labels. Label distribution learning associates with each data item a probability distribution over the labels for that item, and can thus preserve the diversity that conventional learning hides or ignores. We introduce a strategy for learning label distributions with only five to ten labels per item by aggregating human-annotated labels over multiple, semantically related data items. Our results suggest that specific label aggregation methods can help provide reliable representative semantics at the population level.
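To make the idea of a per-item label distribution and of aggregation over related items concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' method: the label set, the function names, and the annotations are hypothetical, and the grouping of semantically related items is assumed to be given. The abstract does not specify how items are grouped or which aggregation methods the paper evaluates; simple pooling is used here only as one possible choice.

```python
from collections import Counter

# Hypothetical label set, chosen for illustration only.
LABELS = ["positive", "neutral", "negative"]


def label_distribution(annotations, labels=LABELS):
    """Return the empirical distribution of annotations over `labels`."""
    counts = Counter(annotations)
    total = sum(counts[label] for label in labels)
    if total == 0:
        raise ValueError("no annotations for the given label set")
    return [counts[label] / total for label in labels]


def pooled_distribution(items, labels=LABELS):
    """Pool the annotations of several related items, then normalize.

    Assumption: `items` already holds the annotation lists of items
    judged to be semantically related; how that grouping is obtained
    is outside the scope of this sketch.
    """
    pooled = [a for annotations in items for a in annotations]
    return label_distribution(pooled, labels)


# Two hypothetical items with five annotations each.
item_a = ["positive", "positive", "positive", "neutral", "neutral"]
item_b = ["positive", "negative", "negative", "negative", "neutral"]

dist_a = label_distribution(item_a)
dist_pooled = pooled_distribution([item_a, item_b])
```

With only five annotations, the distribution for a single item is coarse (each probability is a multiple of 0.2). Pooling across related items gives a finer estimate, at the cost of assuming that the grouped items share a similar underlying distribution.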

Cite

Liu, T., Bongale, P. S., Venkatachalam, A., & Homan, C. M. (2019). Learning to predict population-level label distributions. In The Web Conference 2019 - Companion of the World Wide Web Conference, WWW 2019 (pp. 1111–1120). Association for Computing Machinery, Inc. https://doi.org/10.1145/3308560.3317082
