From credit assignment to entropy regularization: Two new algorithms for neural sequence prediction


Abstract

In this work, we study the credit assignment problem in reward-augmented maximum likelihood (RAML) learning and establish a theoretical equivalence between the token-level counterpart of RAML and entropy-regularized reinforcement learning. Inspired by this connection, we propose two sequence prediction algorithms: one extends RAML with fine-grained credit assignment, and the other improves Actor-Critic with systematic entropy regularization. On two benchmark datasets, we show that the proposed algorithms outperform RAML and Actor-Critic respectively, providing new alternatives for sequence prediction.
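For context, a minimal sketch of the two objectives the abstract relates, written in standard notation rather than taken from the paper itself. Here r(y, y*) is a sequence-level reward, tau a temperature, q the exponentiated-payoff distribution of RAML (Norouzi et al., 2016), and H the Shannon entropy; all symbols are illustrative assumptions.

% RAML maximizes model log-likelihood under the exponentiated-payoff
% distribution q(y | y*; tau) \propto exp( r(y, y*) / tau ):
\mathcal{L}_{\mathrm{RAML}}(\theta) = \mathbb{E}_{y \sim q(y \mid y^{*};\, \tau)} \left[ \log p_{\theta}(y \mid x) \right]

% Entropy-regularized RL maximizes expected reward plus a scaled entropy bonus:
\mathcal{J}_{\mathrm{ent}}(\theta) = \mathbb{E}_{y \sim p_{\theta}(y \mid x)} \left[ r(y, y^{*}) \right] + \tau\, \mathcal{H}\left( p_{\theta}(\cdot \mid x) \right)

Roughly, and up to constants and scaling in theta, the first objective minimizes KL(q || p_theta) while the second minimizes KL(p_theta || q), i.e., the two methods optimize opposite directions of the same divergence; the token-level analysis described in the abstract builds on this kind of relationship.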

Cite

APA

Dai, Z., Xie, Q., & Hovy, E. (2018). From credit assignment to entropy regularization: Two new algorithms for neural sequence prediction. In ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) (Vol. 1, pp. 1672–1682). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p18-1155
