Losses over Labels: Weakly Supervised Learning via Direct Loss Construction


Abstract

Owing to the prohibitive costs of generating large amounts of labeled data, programmatic weak supervision is a growing paradigm within machine learning. In this setting, users design heuristics that provide noisy labels for subsets of the data. These weak labels are combined (typically via a graphical model) to form pseudolabels, which are then used to train a downstream model. In this work, we question a foundational premise of the typical weakly supervised learning pipeline: given that the heuristic provides all “label” information, why do we need to generate pseudolabels at all? Instead, we propose to directly transform the heuristics themselves into corresponding loss functions that penalize differences between our model and the heuristic. By constructing losses directly from the heuristics, we can incorporate more information than is used in the standard weakly supervised pipeline, such as how the heuristics make their decisions, which explicitly informs feature selection during training. We call our method Losses over Labels (LoL) as it creates losses directly from heuristics without going through the intermediate step of a label. We show that LoL improves upon existing weak supervision methods on several benchmark text and image classification tasks and further demonstrate that incorporating gradient information leads to better performance on almost every task.
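To make the idea concrete, here is a minimal PyTorch sketch of the loss-construction step the abstract describes. It is an illustration, not the authors' implementation: the names (`lol_loss`, `vote_fn`, `feature_mask`, `grad_weight`), the representation of a heuristic as a vote function plus the set of input features it consults, and the specific form of the gradient penalty are all assumptions made for this sketch.

```python
# Illustrative sketch of Losses over Labels (LoL) for binary classification.
# Each heuristic contributes its own loss term; no pseudolabels are formed.
import torch
import torch.nn as nn
import torch.nn.functional as F

def lol_loss(model, x, heuristics, grad_weight=0.1):
    """Sum of per-heuristic losses (hypothetical helper, not the paper's API).

    `heuristics` is a list of (vote_fn, feature_mask) pairs, where
    vote_fn(x) returns weak labels in {-1, 0, +1} (0 = abstain) and
    feature_mask marks the input features the heuristic looks at --
    a crude stand-in for "how the heuristic makes its decision".
    """
    x = x.clone().requires_grad_(True)
    logits = model(x).squeeze(-1)  # one logit per example
    # Input gradients capture which features the model is sensitive to.
    input_grads = torch.autograd.grad(logits.sum(), x, create_graph=True)[0]
    total = x.new_zeros(())
    for vote_fn, feature_mask in heuristics:
        votes = vote_fn(x)
        fired = votes != 0  # skip examples where the heuristic abstains
        if fired.any():
            # Penalize disagreement between the model and the heuristic.
            target = (votes[fired] > 0).float()
            total = total + F.binary_cross_entropy_with_logits(
                logits[fired], target)
        # Gradient term: discourage sensitivity to features the heuristic
        # does not use, nudging feature selection toward the heuristic's.
        total = total + grad_weight * input_grads[:, ~feature_mask].pow(2).mean()
    return total

# Usage on toy data: one heuristic that votes by the sign of feature 0.
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
x = torch.randn(32, 10)
mask = torch.zeros(10, dtype=torch.bool)
mask[0] = True
heuristics = [(lambda inp: torch.sign(inp[:, 0]).long(), mask)]
loss = lol_loss(model, x, heuristics)
loss.backward()  # trains the model directly on heuristic-derived losses
```

In the full method, each heuristic's loss could be tailored to that heuristic; the cross-entropy-plus-gradient-penalty form above is just one plausible instantiation of the idea.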



Citation (APA)

Sam, D., & Kolter, J. Z. (2023). Losses over Labels: Weakly Supervised Learning via Direct Loss Construction. In Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023 (Vol. 37, pp. 9695–9703). AAAI Press. https://doi.org/10.1609/aaai.v37i8.26159
