SIDECONTROL: Controlled Open-domain Dialogue Generation via Additive Side Networks

Abstract

Transformer-based pre-trained language models boost the performance of open-domain dialogue systems. Prior work leverages such models to generate text with desired attributes via two general approaches: (1) gradient-based methods, which update all latent representations of the pre-trained model with gradients from attribute models; and (2) weighted-decoding methods, which re-rank beam candidates from the pre-trained model with attribute functions. However, gradient-based methods incur high computational cost and easily overfit on small training sets, while weighted-decoding methods are inherently constrained by the low-variance, high-bias pre-trained model. In this work, we propose a novel approach to controlling the generation of Transformer-based pre-trained language models: the SIDECONTROL framework, which leverages a novel control attributes loss to incorporate useful control signals and is shown to perform well with very limited training samples. We evaluate our proposed method on two benchmark open-domain dialogue datasets; results show that the SIDECONTROL framework achieves better controllability, higher generation quality, and better sample-efficiency than existing gradient-based and weighted-decoding baselines.
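
The core idea described above, a small trainable side network added on top of a frozen pre-trained model, can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class and module names (`SideControlSketch`, `side_net`) are hypothetical, it assumes a HuggingFace-style causal LM (e.g., `GPT2LMHeadModel`) that exposes `hidden_states` and an `lm_head`, and the paper's control attributes loss is not reproduced here.

```python
import torch.nn as nn

class SideControlSketch(nn.Module):
    """Minimal sketch of an additive side network (illustrative only).

    The pre-trained base model stays frozen; only the lightweight side
    network is trained, and its output is added to the base model's
    final hidden states before the language-model head.
    """

    def __init__(self, base_model, hidden_size, side_size=128):
        super().__init__()
        self.base_model = base_model
        for p in self.base_model.parameters():
            p.requires_grad = False  # keep pre-trained weights fixed

        # Small trainable side network; a two-layer MLP here for
        # illustration (the paper's side network may differ).
        self.side_net = nn.Sequential(
            nn.Linear(hidden_size, side_size),
            nn.ReLU(),
            nn.Linear(side_size, hidden_size),
        )

    def forward(self, input_ids, attention_mask=None):
        # Final-layer hidden states from the frozen base model.
        outputs = self.base_model(
            input_ids,
            attention_mask=attention_mask,
            output_hidden_states=True,
        )
        hidden = outputs.hidden_states[-1]

        # Additive correction: the side network nudges the frozen
        # representations toward the desired control attributes.
        controlled = hidden + self.side_net(hidden)

        # Reuse the frozen LM head to map back to vocabulary logits.
        return self.base_model.lm_head(controlled)
```

In this sketch only `side_net` receives gradients, which is what makes the approach cheaper than gradient-based methods that update all latent representations; training would combine the usual language-modeling loss with the paper's control attributes loss, which is omitted above.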

Cite

APA

Du, W., & Ji, Y. (2021). SIDECONTROL: Controlled Open-domain Dialogue Generation via Additive Side Networks. In Findings of the Association for Computational Linguistics: EMNLP 2021 (pp. 2175–2194). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.findings-emnlp.188
