How does BERT's attention change when you fine-tune? An analysis methodology and a case study in negation scope

30 citations · 155 Mendeley readers

Abstract

Large pretrained language models like BERT, after fine-tuning to a downstream task, have achieved high performance on a variety of NLP problems. Yet explaining their decisions is difficult despite recent work probing their internal representations. We propose a procedure and analysis methods that take a hypothesis about how a transformer-based model might encode a linguistic phenomenon, and test the validity of that hypothesis by comparing knowledge-related downstream tasks with downstream control tasks and by measuring cross-dataset consistency. We apply this methodology to test BERT and RoBERTa on the hypothesis that some attention heads will consistently attend from a word in negation scope to the negation cue. We find that after fine-tuning BERT and RoBERTa on a negation scope task, the average attention head improves its sensitivity to negation and its attention consistency across negation datasets compared to the pre-trained models. However, only the base models (not the large models) improve compared to a control task, indicating there is evidence for a shallow encoding of negation only in the base models.
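
The core measurement the abstract describes is, for each attention head, how much weight flows from tokens inside a negation scope to the negation cue. The snippet below is a minimal sketch of that measurement on a single sentence, assuming the HuggingFace transformers library and a pre-trained bert-base-uncased checkpoint; the example sentence and the crude scope heuristic are illustrative placeholders for the annotated negation-scope corpora used in the paper, and this is not the authors' released code.

```python
# Sketch: per-head attention from in-scope tokens to the negation cue
# for one sentence, using a pre-trained BERT (assumption: HuggingFace transformers).
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

sentence = "The results were not significant in any of the trials."
inputs = tokenizer(sentence, return_tensors="pt")
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

cue_idx = tokens.index("not")                      # position of the negation cue
sep_idx = tokens.index("[SEP]")
# Crude stand-in for gold scope annotations: everything after the cue,
# excluding the final period and [SEP].
scope_idx = list(range(cue_idx + 1, sep_idx - 1))

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one tensor per layer, shape (batch, num_heads, seq_len, seq_len).
for layer, att in enumerate(outputs.attentions):
    head_to_cue = att[0, :, :, cue_idx]            # (num_heads, seq_len): attention onto the cue
    scope_to_cue = head_to_cue[:, scope_idx].mean(dim=-1)  # mean over in-scope source tokens
    print(f"layer {layer:2d}:", [round(w, 3) for w in scope_to_cue.tolist()])
```

Repeating this measurement over whole datasets, and contrasting a model fine-tuned on negation scope with one fine-tuned on a control task, yields the kind of per-head sensitivity and cross-dataset consistency comparison the abstract reports.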

Cite

CITATION STYLE

APA

Zhao, Y., & Bethard, S. (2020). How does BERT’s attention change when you fine-tune? An analysis methodology and a case study in negation scope. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 4729–4747). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.acl-main.429
