Increasing The Performance of Cognitively Inspired Data-Efficient Language Models via Implicit Structure Building

Abstract

In this paper, we describe our submission to the BabyLM Challenge 2023 shared task on data-efficient language model (LM) pretraining (Warstadt et al., 2023). We train transformer-based masked language models that incorporate unsupervised predictions about hierarchical sentence structure into the model architecture. Concretely, we use the StructFormer architecture (Shen et al., 2021) and variants thereof. StructFormer models have been shown to perform well on unsupervised syntactic induction from limited pretraining data and to yield performance improvements over a vanilla transformer architecture (Shen et al., 2021). Evaluation of our models on the 39 tasks provided by the BabyLM challenge shows promising improvements on some tasks for models that integrate a hierarchical bias into the architecture, although they do not consistently outperform the baseline model across all tasks.
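For illustration, the following is a minimal, hypothetical PyTorch sketch of distance-gated self-attention in the spirit of StructFormer: a small convolutional parser predicts a syntactic distance for each adjacent token pair, and those distances are turned into soft gates that damp attention across likely constituent boundaries. The module names and the exact gating formula are illustrative assumptions, not the authors' implementation or the precise formulation of Shen et al. (2021).

```python
# Hypothetical sketch: a parser network induces boundary distances that bias
# the self-attention of a masked LM. Names and formulas are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DistanceParser(nn.Module):
    """Predicts one syntactic distance per adjacent token pair (length T-1)."""

    def __init__(self, d_model: int, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(d_model, d_model, kernel_size, padding=kernel_size // 2)
        self.proj = nn.Linear(2 * d_model, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, T, d_model)
        h = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)   # (B, T, d)
        pairs = torch.cat([h[:, :-1], h[:, 1:]], dim=-1)               # (B, T-1, 2d)
        return self.proj(pairs).squeeze(-1)                            # (B, T-1)


class DistanceGatedAttention(nn.Module):
    """Multi-head self-attention whose scores are biased by boundary distances."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor, distances: torch.Tensor) -> torch.Tensor:
        # Probability that each adjacent gap is a constituent boundary.
        boundary = torch.sigmoid(distances)                            # (B, T-1)
        # Log-probability of "no boundary" per gap; summing these over the gaps
        # between tokens i and j yields a non-positive additive attention bias,
        # so attention is damped the more likely a boundary separates i and j.
        log_keep = torch.log1p(-boundary.clamp(max=1 - 1e-6))          # (B, T-1)
        cum = F.pad(torch.cumsum(log_keep, dim=1), (1, 0))             # (B, T)
        bias = -(cum.unsqueeze(1) - cum.unsqueeze(2)).abs()            # (B, T, T)
        # Expand the additive bias to every attention head.
        bias = bias.repeat_interleave(self.attn.num_heads, dim=0)      # (B*H, T, T)
        out, _ = self.attn(x, x, x, attn_mask=bias)
        return out


# Toy usage: gate attention over a 10-token batch with the induced distances.
parser = DistanceParser(d_model=64)
gated_attn = DistanceGatedAttention(d_model=64, n_heads=4)
tokens = torch.randn(2, 10, 64)           # stand-in for token embeddings
out = gated_attn(tokens, parser(tokens))  # (2, 10, 64)
```

Because the gate is a differentiable function of the predicted distances, the parser can in principle be trained end-to-end from the masked-LM objective alone, which is the sense in which the structure induction is unsupervised.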

Citation (APA)

Momen, O., Arps, D., & Kallmeyer, L. (2023). Increasing The Performance of Cognitively Inspired Data-Efficient Language Models via Implicit Structure Building. In CoNLL 2023 - BabyLM Challenge at the 27th Conference on Computational Natural Language Learning, Proceedings (pp. 327–338). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.conll-babylm.29
