Joint energy-based model training for better calibrated natural language understanding models

Abstract

In this work, we explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders (e.g., RoBERTa) for natural language understanding (NLU) tasks. Our experiments show that EBM training can help the model reach better calibration that is competitive with strong baselines, with little or no loss in accuracy. We discuss three variants of energy functions (namely scalar, hidden, and sharp-hidden) that can be defined on top of a text encoder, and compare them in experiments. Due to the discreteness of text data, we adopt noise contrastive estimation (NCE) to train the energy-based model. To make NCE training more effective, we train an auto-regressive noise model with the masked language model (MLM) objective.
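
The abstract only names the energy-function variants and the NCE objective, so a short sketch may help. The PyTorch code below (hypothetical names such as EnergyHead and nce_loss; one noise sample per data sample assumed) shows one plausible reading: "scalar" uses a separate scalar head on the encoder's pooled state, "hidden" takes the negative log-sum-exp of the task logits, "sharp-hidden" takes the negative maximum logit, and the NCE loss discriminates data from noise-model samples. This is an illustration under those assumptions, not the paper's exact implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class EnergyHead(nn.Module):
        """Energy defined on top of a text encoder's pooled state and class logits."""

        def __init__(self, hidden_size: int, variant: str = "scalar"):
            super().__init__()
            self.variant = variant
            # Extra scalar projection, used only by the "scalar" variant.
            self.scalar_proj = nn.Linear(hidden_size, 1)

        def forward(self, pooled: torch.Tensor, logits: torch.Tensor) -> torch.Tensor:
            if self.variant == "scalar":
                # Energy from a dedicated scalar head on the pooled representation.
                return self.scalar_proj(pooled).squeeze(-1)
            if self.variant == "hidden":
                # Energy as the negative log-sum-exp of the task logits.
                return -torch.logsumexp(logits, dim=-1)
            if self.variant == "sharp-hidden":
                # Energy as the negative maximum task logit.
                return -logits.max(dim=-1).values
            raise ValueError(f"unknown variant: {self.variant}")


    def nce_loss(energy_data, energy_noise, log_pn_data, log_pn_noise):
        """Binary NCE: classify whether a sequence came from the data or from the
        noise model; log p_theta(x) is taken as -E(x), with the normalizer absorbed."""
        logit_data = -energy_data - log_pn_data      # data samples -> label 1
        logit_noise = -energy_noise - log_pn_noise   # noise samples -> label 0
        loss = F.binary_cross_entropy_with_logits(logit_data, torch.ones_like(logit_data)) \
             + F.binary_cross_entropy_with_logits(logit_noise, torch.zeros_like(logit_noise))
        return loss

In this reading, the energy head is trained jointly with the classification objective during finetuning, and the auto-regressive noise model supplies the log_pn_* terms for the NCE loss.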

Citation (APA)

He, T., McCann, B., Xiong, C., & Hosseini-Asl, E. (2021). Joint energy-based model training for better calibrated natural language understanding models. In EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference (pp. 1754–1761). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.eacl-main.151
