Abstract
In this study, we describe our submission to the strict-small track of the 2023 BabyLM shared task. Our findings demonstrate the feasibility of training high-performing models within the constraints of limited data, computational resources, and time. We provide evidence that the formatting of input can significantly impact downstream performance. Furthermore, inducing structural biases into the models through the use of part-of-speech trees yields modest benefits. Our most successful model achieves 79% on the BLiMP evaluations and 72% on the SuperGLUE evaluations. All models trained during this study can be found at https://huggingface.co/mcgillbabylm.
Cheng, Z., Aralikatte, R., Porada, I., Piano, C. S. D., & Kit Cheung, J. C. (2023). McGill BabyLM Shared Task Submission: The Effects of Data Formatting and Structural Biases. In CoNLL 2023 - BabyLM Challenge at the 27th Conference on Computational Natural Language Learning, Proceedings (pp. 207–220). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.conll-babylm.18