Diversity-Aware Batch Active Learning for Dependency Parsing

Tianze Shi; Adrian Benton; Igor Malioutov; Ozan İrsoy

Conference Proceedings (Open Access)

NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference (2021) 2616-2626

DOI: 10.18653/v1/2021.naacl-main.207

Abstract

While the predictive performance of modern statistical dependency parsers relies heavily on the availability of expensive expert-annotated treebank data, not all annotations contribute equally to the training of the parsers. In this paper, we attempt to reduce the number of labeled examples needed to train a strong dependency parser using batch active learning (AL). In particular, we investigate whether enforcing diversity in the sampled batches, using determinantal point processes (DPPs), can improve over their diversity-agnostic counterparts. Simulation experiments on an English newswire corpus show that selecting diverse batches with DPPs is superior to strong selection strategies that do not enforce batch diversity, especially during the initial stages of the learning process. Additionally, our diversity-aware strategy is robust under a corpus duplication setting, where diversity-agnostic sampling strategies exhibit significant degradation.
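
To make the idea of diversity-aware batch selection concrete, the following is a minimal sketch, not the authors' implementation. It assumes each unlabeled sentence has a feature vector and a scalar quality score (for example, a model-uncertainty estimate), builds a DPP kernel with the standard quality-diversity decomposition L_ij = q_i S_ij q_j, and picks a batch by greedy MAP approximation. The function name `greedy_dpp_batch`, the parameters `features` and `quality`, the cosine-similarity choice for S, and the greedy inference step are all illustrative assumptions; the paper may use different representations and a different DPP sampling or inference procedure.

```python
import numpy as np


def greedy_dpp_batch(features, quality, batch_size):
    """Greedily pick items that approximately maximize det(L_Y).

    features:   (n, d) array, one hypothetical feature vector per sentence.
    quality:    (n,) array of positive scores (e.g. uncertainty); assumed.
    batch_size: maximum number of items to select.
    """
    # Unit-normalize rows so the similarity matrix holds cosine similarities.
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    similarity = unit @ unit.T
    # Quality-diversity decomposition: L_ij = q_i * S_ij * q_j.
    kernel = quality[:, None] * similarity * quality[None, :]

    selected = []
    for _ in range(batch_size):
        best_item, best_logdet = None, -np.inf
        for i in range(len(quality)):
            if i in selected:
                continue
            idx = selected + [i]
            # log det of the kernel restricted to the candidate batch.
            sign, logdet = np.linalg.slogdet(kernel[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best_item, best_logdet = i, logdet
        if best_item is None:
            # Every remaining item is (numerically) redundant with the batch.
            break
        selected.append(best_item)
    return selected
```

Under these assumptions, an exact duplicate of an already selected item makes the restricted kernel singular, so the greedy step skips it even when its quality score is high. This mirrors, in a toy form, the corpus duplication setting mentioned in the abstract, where ranking by score alone would fill the batch with near-identical sentences.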

Citation (APA)

Shi, T., Benton, A., Malioutov, I., & İrsoy, O. (2021). Diversity-Aware Batch Active Learning for Dependency Parsing. In NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference (pp. 2616–2626). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.naacl-main.207
