Learning to predict population-level label distributions

Tong Liu; Pratik Sanjay Bongale; Akash Venkatachalam; Christopher M. Homan

Conference Proceedings, Open Access

The Web Conference 2019 - Companion of the World Wide Web Conference, WWW 2019 (2019) 1111-1120

DOI: 10.1145/3308560.3317082

20 Citations

18 Readers

Abstract

Machine learning problems are often subjective or ambiguous. That is, humans solving the same problems might come to legitimate but completely different conclusions, based on their personal experiences and beliefs. In supervised learning, particularly when using crowdsourced training data, multiple annotations per data item are usually reduced to a single label representing ground truth. This hides a rich source of diversity and subjectivity of opinions about the labels. Label distribution learning associates with each data item a probability distribution over the labels for that item, and thus can preserve the diversity that conventional learning hides or ignores. We introduce a strategy for learning label distributions with only five to ten labels per item by aggregating human-annotated labels over multiple, semantically related data items. Our results suggest that specific label aggregation methods can help provide reliable representative semantics at the population level.
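
As a rough illustration of the idea in the abstract, the sketch below turns each item's annotations into an empirical label distribution and then pools annotations across items assumed to be semantically related. It is not the authors' method: the function names, the `item_groups` mapping, and the toy data are hypothetical, and how related items are identified is left open here.

```python
from collections import Counter


def label_distribution(annotations, labels):
    """Return the empirical probability of each label in `labels`."""
    counts = Counter(annotations)
    total = sum(counts[label] for label in labels)
    if total == 0:
        raise ValueError("no annotations for the given label set")
    return {label: counts[label] / total for label in labels}


def pooled_distributions(item_annotations, item_groups, labels):
    """Pool annotations over items in the same group (assumed related).

    `item_groups` maps each item to a group id; producing that mapping
    (for example by clustering) is outside this sketch.
    """
    pooled = {}
    for item, annotations in item_annotations.items():
        pooled.setdefault(item_groups[item], []).extend(annotations)
    group_dist = {
        group: label_distribution(annotations, labels)
        for group, annotations in pooled.items()
    }
    # Every item inherits the distribution of its group.
    return {item: group_dist[item_groups[item]] for item in item_annotations}


# Hypothetical toy data: five annotations per item, two labels.
LABELS = ["pos", "neg"]
ITEM_ANNOTATIONS = {
    "a": ["pos", "pos", "neg", "pos", "pos"],
    "b": ["neg", "neg", "pos", "neg", "neg"],
    "c": ["pos", "pos", "pos", "pos", "pos"],
}
ITEM_GROUPS = {"a": 0, "b": 0, "c": 1}  # assumed grouping of related items

per_item = {k: label_distribution(v, LABELS) for k, v in ITEM_ANNOTATIONS.items()}
pooled = pooled_distributions(ITEM_ANNOTATIONS, ITEM_GROUPS, LABELS)
```

With only five annotations, each per-item distribution is coarse; pooling over a group trades item-level detail for a larger sample, which is the kind of trade-off the abstract alludes to.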

Citation (APA)

Liu, T., Bongale, P. S., Venkatachalam, A., & Homan, C. M. (2019). Learning to predict population-level label distributions. In The Web Conference 2019 - Companion of the World Wide Web Conference, WWW 2019 (pp. 1111–1120). Association for Computing Machinery, Inc. https://doi.org/10.1145/3308560.3317082
