Abstract
In this work, we explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders (e.g., RoBERTa) for natural language understanding (NLU) tasks. Our experiments show that EBM training can help the model reach calibration that is competitive with strong baselines, with little or no loss in accuracy. We discuss three variants of energy functions (namely scalar, hidden, and sharp-hidden) that can be defined on top of a text encoder, and compare them in experiments. Due to the discreteness of text data, we adopt noise contrastive estimation (NCE) to train the energy-based model. To make NCE training more effective, we train an auto-regressive noise model with the masked language model (MLM) objective.
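To make the abstract's two key ingredients concrete, below is a minimal PyTorch sketch of (a) three candidate energy functions defined on top of a text encoder's pooled representation and (b) the binary NCE loss used to train an EBM against a noise distribution. This is an illustrative sketch under stated assumptions, not the paper's code: names like `EnergyHeads` and `nce_loss` are hypothetical, and the paper's exact parameterizations of the scalar, hidden, and sharp-hidden variants may differ.

```python
# Hedged sketch: energy functions over an encoder representation, plus a
# standard binary NCE loss. All names here are illustrative assumptions.
import torch
import torch.nn.functional as F
from torch import nn


class EnergyHeads(nn.Module):
    """Candidate energy functions E(x) over a pooled encoding h(x)."""

    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.scalar_head = nn.Linear(hidden_size, 1)        # "scalar" variant
        self.cls_head = nn.Linear(hidden_size, num_labels)  # classifier logits

    def forward(self, h: torch.Tensor, variant: str) -> torch.Tensor:
        logits = self.cls_head(h)  # shape: (batch, num_labels)
        if variant == "scalar":
            # Energy from a dedicated scalar projection of the encoding.
            return self.scalar_head(h).squeeze(-1)
        if variant == "hidden":
            # Energy tied to the classifier: E(x) = -logsumexp(logits).
            return -torch.logsumexp(logits, dim=-1)
        if variant == "sharp-hidden":
            # A sharpened version using the max logit instead of logsumexp.
            return -logits.max(dim=-1).values
        raise ValueError(f"unknown variant: {variant}")


def nce_loss(e_data: torch.Tensor, e_noise: torch.Tensor,
             log_q_data: torch.Tensor, log_q_noise: torch.Tensor,
             log_k: float = 0.0) -> torch.Tensor:
    """Binary NCE: classify data samples vs. samples from the noise model.

    e_* are energies E(x); log_q_* are noise-model log-probabilities
    log q(x); log_k is log(K) for K noise samples per data sample. The
    EBM's unnormalized log-density is taken to be -E(x), so the logit of
    P(data | x) is -E(x) - log q(x) - log K.
    """
    logit_data = -e_data - log_q_data - log_k
    logit_noise = -e_noise - log_q_noise - log_k
    return (
        F.binary_cross_entropy_with_logits(
            logit_data, torch.ones_like(logit_data))
        + F.binary_cross_entropy_with_logits(
            logit_noise, torch.zeros_like(logit_noise))
    )
```

In joint training of the kind the abstract describes, an NCE term like this would be added to the usual classification loss during finetuning, with noise samples drawn from the separately trained noise model.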
He, T., McCann, B., Xiong, C., & Hosseini-Asl, E. (2021). Joint energy-based model training for better calibrated natural language understanding models. In EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference (pp. 1754–1761). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.eacl-main.151