ULF: Unsupervised Labeling Function Correction using Cross-Validation for Weak Supervision

Abstract

A cost-effective alternative to manual data labeling is weak supervision (WS), where data samples are automatically annotated using a predefined set of labeling functions (LFs), rule-based mechanisms that generate artificial labels for the associated classes. In this work, we investigate noise reduction techniques for WS based on the principle of k-fold cross-validation. We introduce a new algorithm ULF for Unsupervised Labeling Function correction, which denoises WS data by leveraging models trained on all but some LFs to identify and correct biases specific to the held-out LFs. Specifically, ULF refines the allocation of LFs to classes by re-estimating this assignment on highly reliable cross-validated samples. Evaluation on multiple datasets confirms ULF's effectiveness in enhancing WS learning without the need for manual labeling.
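The procedure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation, only one possible reading of the idea under loudly stated assumptions: `Z` is a hypothetical binary sample-by-LF match matrix, `T` is a hypothetical LF-by-class assignment matrix, and the classifier is a toy nearest-centroid model standing in for whatever model would actually be trained. The round-robin LF split, the confidence `threshold`, and the `mix` weight are illustrative choices as well. All function and parameter names are invented for this example.

```python
import math

def noisy_labels(Z, T):
    """Sum the class scores of all matching LFs and take the argmax (None if no LF matches)."""
    labels = []
    for row in Z:
        scores = [sum(z * T[j][c] for j, z in enumerate(row)) for c in range(len(T[0]))]
        labels.append(max(range(len(scores)), key=scores.__getitem__) if any(row) else None)
    return labels

def fit_centroids(X, y, n_classes):
    """Toy stand-in model (assumption): one mean feature vector per class."""
    dim = len(X[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for x, label in zip(X, y):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += x[d]
    return [[s / counts[c] for s in sums[c]] if counts[c] else None for c in range(n_classes)]

def predict_proba(centroids, x):
    """Softmax over negative squared distances to the class centroids."""
    logits = [-sum((a - b) ** 2 for a, b in zip(x, c)) if c is not None else -math.inf
              for c in centroids]
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ulf_style_correction(X, Z, T, k=2, threshold=0.9, mix=0.5):
    """Re-estimate the LF-to-class matrix from confident out-of-fold predictions."""
    n_lfs, n_classes = len(T), len(T[0])
    y = noisy_labels(Z, T)
    counts = [[0.0] * n_classes for _ in range(n_lfs)]
    for fold in range(k):
        # Assumption: LFs are assigned to folds round-robin by index.
        held_out = {j for j in range(n_lfs) if j % k == fold}
        # Train only on labeled samples that match none of the held-out LFs.
        train = [i for i, row in enumerate(Z)
                 if y[i] is not None and not any(row[j] for j in held_out)]
        test = [i for i, row in enumerate(Z) if any(row[j] for j in held_out)]
        if not train:
            continue
        model = fit_centroids([X[i] for i in train], [y[i] for i in train], n_classes)
        for i in test:
            probs = predict_proba(model, X[i])
            pred = max(range(n_classes), key=probs.__getitem__)
            if probs[pred] >= threshold:  # keep only highly confident predictions
                for j in held_out:
                    if Z[i][j]:
                        counts[j][pred] += 1
    corrected = []
    for j in range(n_lfs):
        total = sum(counts[j])
        if total == 0:
            corrected.append(list(T[j]))  # no evidence: keep the original assignment
        else:
            corrected.append([mix * T[j][c] + (1 - mix) * counts[j][c] / total
                              for c in range(n_classes)])
    return corrected

# Toy data (made up): class 0 lies near x = 0, class 1 near x = 10.
# LF 2 is assigned to class 0 but only fires on samples near x = 10.
X = [[0.0], [0.5], [1.0], [9.0], [9.5], [10.0], [10.5], [11.0]]
Z = [[1, 0, 0, 1, 0], [1, 0, 0, 0, 0], [0, 0, 0, 1, 0], [0, 1, 0, 0, 0],
     [0, 0, 1, 0, 0], [0, 0, 1, 0, 0], [0, 1, 0, 0, 1], [0, 0, 0, 0, 1]]
T = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

corrected = ulf_style_correction(X, Z, T, k=5, threshold=0.9, mix=0.25)
print(corrected[2])
```

In this toy setup, the model trained without LF 2 confidently assigns the samples LF 2 matches to class 1, so most of LF 2's weight shifts to class 1 while the other LFs keep their original classes. Blending the old and re-estimated rows through `mix` is a design choice of this sketch that keeps a single round of noisy predictions from overwriting the original assignment entirely.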

Cite

Sedova, A., & Roth, B. (2023). ULF: Unsupervised Labeling Function Correction using Cross-Validation for Weak Supervision. In EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 4162–4176). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.emnlp-main.254
