Automatic Rule Induction for Efficient Semi-Supervised Learning

Abstract

Semi-supervised learning has shown promise in allowing NLP models to generalize from small amounts of labeled data. Meanwhile, pretrained transformer models act as black-box correlation engines that are difficult to explain and sometimes behave unreliably. In this paper, we propose tackling both of these challenges via Automatic Rule Induction (ARI), a simple and general-purpose framework for the automatic discovery and integration of symbolic rules into pretrained transformer models. First, we extract weak symbolic rules from low-capacity machine learning models trained on small amounts of labeled data. Next, we use an attention mechanism to integrate these rules into high-capacity pretrained transformer models. Last, the rule-augmented system becomes part of a self-training framework to boost the supervision signal on unlabeled data. These steps can be layered beneath a variety of existing weak supervision and semi-supervised NLP algorithms in order to improve performance and interpretability. Experiments across nine sequence classification and relation extraction tasks suggest that ARI can improve state-of-the-art methods with no manual effort and minimal computational overhead.
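To make the three steps concrete, below is a minimal, hypothetical sketch in Python. It is not the paper's implementation: the choice of rule extractor (an n-gram decision tree), the helper names (extract_ngram_rules, rule_votes, combine, pseudo_label), and the fixed attention weights are all illustrative assumptions. In ARI itself, the attention over rules is learned end to end inside the transformer rather than applied post hoc.

```python
# Illustrative sketch of the ARI pipeline described in the abstract.
# All function names and hyperparameters here are hypothetical.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier


def extract_ngram_rules(texts, labels, max_rules=16):
    """Step 1 (sketch): fit a low-capacity model on the small labeled set
    and read symbolic rules off its learned structure. Each important
    n-gram feature becomes an 'if n-gram present, vote for the majority
    class among matching examples' rule."""
    vectorizer = CountVectorizer(ngram_range=(1, 2), binary=True)
    X = vectorizer.fit_transform(texts)
    tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, labels)
    vocab = np.array(vectorizer.get_feature_names_out())
    labels = np.array(labels)
    rules = []
    for idx in np.argsort(tree.feature_importances_)[::-1][:max_rules]:
        if tree.feature_importances_[idx] == 0:
            break
        mask = np.asarray(X[:, idx].todense()).ravel() > 0
        if not mask.any():
            continue
        rules.append((vocab[idx], int(np.bincount(labels[mask]).argmax())))
    return rules


def rule_votes(text, rules, num_classes):
    """Each firing rule casts a one-hot vote; non-firing rules abstain.
    Substring matching is a rough stand-in for real feature extraction."""
    votes = []
    for ngram, label in rules:
        if ngram in text.lower():
            v = np.zeros(num_classes)
            v[label] = 1.0
            votes.append(v)
    return votes


def combine(backbone_probs, votes, attn_weights):
    """Step 2 (sketch): blend the transformer's predicted distribution with
    the rule votes via softmax attention weights (one per firing rule plus
    the backbone). ARI learns this weighting; here it is a fixed input."""
    scores = [backbone_probs] + votes
    w = np.exp(np.asarray(attn_weights[: len(scores)], dtype=float))
    w = w / w.sum()
    return sum(wi * s for wi, s in zip(w, scores))


def pseudo_label(unlabeled, rules, backbone_predict, attn_weights,
                 num_classes, threshold=0.9):
    """Step 3 (sketch): self-training. Confident predictions of the
    rule-augmented system become pseudo-labels for another training round.
    `backbone_predict` is an assumed callable returning class probabilities."""
    out = []
    for text in unlabeled:
        p = combine(backbone_predict(text),
                    rule_votes(text, rules, num_classes), attn_weights)
        if p.max() >= threshold:
            out.append((text, int(p.argmax())))
    return out
```

In this toy form, the loop would be: extract rules from the labeled seed set, pseudo-label the unlabeled pool with the rule-augmented model, then fine-tune the backbone on labeled plus pseudo-labeled data and repeat, which mirrors the layered structure the abstract describes.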

Citation (APA)

Pryzant, R., Yang, Z., Xu, Y., Zhu, C., & Zeng, M. (2022). Automatic Rule Induction for Efficient Semi-Supervised Learning. In Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 28–44). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-emnlp.3
