Using Roark-Hollingshead Distance to Probe BERT’s Syntactic Competence

3Citations
Citations of this article
19Readers
Mendeley users who have this article in their library.

Abstract

Probing BERT’s general ability to reason about syntax is no simple endeavour, primarily because of the uncertainty surrounding how large language models represent syntactic structure. Many prior accounts of BERT’s agility as a syntactic tool (Clark et al., 2013; Lau et al., 2014; Marvin and Linzen, 2018; Chowdhury and Zamparelli, 2018; Warstadt et al., 2019, 2020; Hu et al., 2020) have therefore confined themselves to studying very specific linguistic phenomena, and there has still been no definitive answer as to whether BERT “knows” syntax. The advent of perturbed masking (Wu et al., 2020) would then seem to be significant, because this is a parameter-free probing method that directly samples syntactic trees from BERT’s embeddings. These sampled trees outperform a right-branching baseline, thus providing preliminary evidence that BERT’s syntactic competence bests a simple baseline. This baseline is underwhelming, however, and our reappraisal below suggests that this result, too, is inconclusive. We propose RH Probe, an encoder-decoder probing architecture that operates on two probing tasks. We find strong empirical evidence confirming the existence of important syntactic information in BERT, but this information alone appears not to be enough to reproduce syntax in its entirety. Our probe makes crucial use of a conjecture made by Roark and Hollingshead (2008) that a particular lexical annotation that we shall call RH distance is a sufficient encoding of unlabelled binary syntactic trees, and we prove this conjecture.

Cite

CITATION STYLE

APA

Niu, J., Lu, W., Corlett, E., & Penn, G. (2022). Using Roark-Hollingshead Distance to Probe BERT’s Syntactic Competence. In BlackboxNLP 2022 - BlackboxNLP Analyzing and Interpreting Neural Networks for NLP, Proceedings of the Workshop (pp. 335–345). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.blackboxnlp-1.27

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free