DReCa: A General Task Augmentation Strategy for Few-Shot Natural Language Inference


Abstract

Meta-learning promises few-shot learners that quickly adapt to new distributions by repurposing knowledge acquired from previous training. However, we believe meta-learning has not yet succeeded in NLP due to the lack of a well-defined task distribution, leading to attempts that treat datasets as tasks. Such an ad hoc task distribution causes problems of quantity and quality. Since there are only a handful of datasets for any NLP problem, meta-learners tend to overfit their adaptation mechanism, and since NLP datasets are highly heterogeneous, many learning episodes have poor transfer between their support and query sets, which discourages the meta-learner from adapting. To alleviate these issues, we propose DReCa (Decomposing datasets into Reasoning Categories), a simple method for discovering and using latent reasoning categories in a dataset to form additional high-quality tasks. DReCa works by splitting examples into label groups, embedding them with a fine-tuned BERT model, and then clustering each group into reasoning categories. Across four few-shot NLI problems, we demonstrate that using DReCa improves the accuracy of meta-learners by 1.5–4%.
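The pipeline the abstract describes (split examples by label, embed each group, cluster within each group to form tasks) can be sketched as below. This is an illustrative reconstruction, not the authors' code: the `embed` function is a stand-in for fine-tuned BERT features, the clustering is a minimal k-means written out in NumPy (the paper's clustering choice may differ), and all names are hypothetical.

```python
import numpy as np

def dreca_tasks(examples, embed, n_clusters=2, n_iters=10, seed=0):
    """Sketch of DReCa-style task augmentation.

    examples: list of dicts with "text" and "label" keys.
    embed:    callable mapping a text to a 1-D feature vector
              (stand-in for a fine-tuned BERT encoder).
    Returns a dict mapping (label, cluster_id) -> list of examples,
    i.e. the latent "reasoning categories" used as extra tasks.
    """
    rng = np.random.default_rng(seed)

    # Step 1: split examples into label groups.
    by_label = {}
    for ex in examples:
        by_label.setdefault(ex["label"], []).append(ex)

    tasks = {}
    for label, group in by_label.items():
        # Step 2: embed every example in the group.
        X = np.stack([embed(ex["text"]) for ex in group])

        # Step 3: cluster the group (toy k-means as a placeholder).
        k = min(n_clusters, len(group))
        centers = X[rng.choice(len(X), size=k, replace=False)]
        assign = np.zeros(len(X), dtype=int)
        for _ in range(n_iters):
            dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            assign = np.argmin(dists, axis=1)
            for j in range(k):
                if (assign == j).any():
                    centers[j] = X[assign == j].mean(axis=0)

        # Each (label, cluster) pair becomes one augmented task.
        for ex, c in zip(group, assign):
            tasks.setdefault((label, int(c)), []).append(ex)
    return tasks
```

Episodes for the meta-learner would then sample support/query sets from within a single `(label, cluster)` task, so that both sets share a latent reasoning category.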

Citation (APA)

Murty, S., Hashimoto, T. B., & Manning, C. D. (2021). DReCa: A General Task Augmentation Strategy for Few-Shot Natural Language Inference. In NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference (pp. 1113–1125). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.naacl-main.88
