Adaptive Contrastive Knowledge Distillation for BERT Compression

Jinyang Guo; Jiaheng Liu; Zining Wang; Yuqing Ma; Ruihao Gong; Ke Xu; Xianglong Liu

Conference Proceedings

Proceedings of the Annual Meeting of the Association for Computational Linguistics (2023) 8941-8953

DOI: 10.18653/v1/2023.findings-acl.569

Abstract

In this paper, we propose a new knowledge distillation approach called adaptive contrastive knowledge distillation (ACKD) for BERT compression. Unlike existing knowledge distillation methods for BERT, which implicitly learn discriminative student features by mimicking the teacher features, we first introduce a novel contrastive distillation loss (CDL) based on hidden state features in BERT as explicit supervision for learning discriminative student features. We further observe that sentences with similar features may have completely different meanings, which makes them hard to distinguish. Existing methods do not pay sufficient attention to these hard samples with less discriminative features. Therefore, we propose a new strategy called sample adaptive reweighting (SAR) to adaptively pay more attention to these hard samples and strengthen their discriminability. We incorporate our SAR strategy into our CDL to form the adaptive contrastive distillation loss, on which we build our ACKD framework. Comprehensive experiments on multiple natural language processing tasks demonstrate the effectiveness of our ACKD framework.
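The abstract does not give the formulas for CDL or SAR, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes an InfoNCE-style contrastive loss in which each student feature should match the teacher feature of the same sentence and differ from the teacher features of other sentences, and it assumes a focal-style weight `(1 - p) ** gamma` as a stand-in for sample adaptive reweighting. The function name, `temperature`, and `gamma` are all hypothetical.

```python
import math


def _normalize(v):
    # Scale a vector to unit length (guard against the zero vector).
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def contrastive_distill_loss(student, teacher, temperature=0.1, gamma=2.0):
    """Illustrative reweighted contrastive loss (assumed form, not from the paper).

    student, teacher: lists of equal-length feature vectors, one per sentence.
    Returns (weighted_loss, per_sample_weights).
    """
    s = [_normalize(v) for v in student]
    t = [_normalize(v) for v in teacher]
    losses, weights = [], []
    for i, s_i in enumerate(s):
        # Cosine similarity of student i to every teacher feature.
        logits = [_dot(s_i, t_j) / temperature for t_j in t]
        # Numerically stable log-softmax for the positive pair (i, i).
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        log_p = logits[i] - log_denom
        p = math.exp(log_p)
        losses.append(-log_p)
        # Assumed reweighting: samples that are hard to tell apart from the
        # negatives (low p) receive a larger weight.
        weights.append((1.0 - p) ** gamma)
    total = sum(weights)
    if total == 0.0:
        return 0.0, weights
    loss = sum(w * l for w, l in zip(weights, losses)) / total
    return loss, weights


# Toy batch: sentence 0 is far from the others, while sentences 1 and 2
# have nearly identical features and are therefore "hard".
teacher_feats = [[1.0, 0.0], [0.0, 1.0], [0.1, 1.0]]
student_feats = [[1.0, 0.0], [0.0, 1.0], [0.1, 1.0]]
loss, weights = contrastive_distill_loss(student_feats, teacher_feats)
```

In this toy batch the two near-duplicate sentences end up with much larger weights than the well-separated one, which is the qualitative behavior the abstract attributes to SAR. A real implementation would operate on BERT hidden states in a tensor library and would follow the loss defined in the paper.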

Cite (APA)

Guo, J., Liu, J., Wang, Z., Ma, Y., Gong, R., Xu, K., & Liu, X. (2023). Adaptive Contrastive Knowledge Distillation for BERT Compression. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 8941–8953). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-acl.569
