Pretraining sentiment classifiers with unlabeled dialog data

Citations: 5
Mendeley readers: 118

Abstract

The huge cost of creating labeled training data is a common problem for supervised learning tasks such as sentiment classification. Recent studies have shown that pretraining on unlabeled data via a language model can improve the performance of classification models. In this paper, we take this idea a step further by using a conditional language model instead of a plain language model. Specifically, we address a sentiment classification task for a tweet analysis service as a case study and propose a pretraining strategy that uses unlabeled dialog data (tweet-reply pairs) via an encoder-decoder model. Experimental results show that our strategy improves the performance of sentiment classifiers and outperforms several state-of-the-art strategies, including language model pretraining.
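Although the abstract gives no implementation details, the overall recipe can be sketched in two steps: pretrain an encoder-decoder (a conditional language model) to generate a reply from a tweet, then transfer the pretrained encoder into a sentiment classifier. The PyTorch sketch below is a minimal illustration under assumed choices — an LSTM seq2seq, a toy vocabulary size, padding id 0, and a linear classifier head are all assumptions, not the authors' exact setup.

```python
# Sketch: (1) pretrain an encoder-decoder on unlabeled tweet-reply pairs,
# (2) reuse the encoder to initialize a sentiment classifier.
# All sizes and module choices below are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB_SIZE, EMB_DIM, HID_DIM, NUM_CLASSES = 10000, 128, 256, 3

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.rnn = nn.LSTM(EMB_DIM, HID_DIM, batch_first=True)

    def forward(self, tweet_ids):
        _, (h, c) = self.rnn(self.embed(tweet_ids))
        return h, c  # final LSTM state summarizes the tweet

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.rnn = nn.LSTM(EMB_DIM, HID_DIM, batch_first=True)
        self.out = nn.Linear(HID_DIM, VOCAB_SIZE)

    def forward(self, reply_ids, state):
        h, _ = self.rnn(self.embed(reply_ids), state)
        return self.out(h)  # next-token logits over the reply

# Step 1: pretraining as a conditional LM — predict the reply given the tweet.
encoder, decoder = Encoder(), Decoder()
seq2seq_loss = nn.CrossEntropyLoss(ignore_index=0)  # 0 = padding id (assumption)
params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

def pretrain_step(tweet_ids, reply_in, reply_target, optimizer):
    optimizer.zero_grad()
    logits = decoder(reply_in, encoder(tweet_ids))
    loss = seq2seq_loss(logits.reshape(-1, VOCAB_SIZE), reply_target.reshape(-1))
    loss.backward()
    optimizer.step()
    return loss.item()

# Step 2: fine-tuning — the pretrained encoder feeds a small classifier head.
class SentimentClassifier(nn.Module):
    def __init__(self, pretrained_encoder):
        super().__init__()
        self.encoder = pretrained_encoder  # weights carried over from step 1
        self.head = nn.Linear(HID_DIM, NUM_CLASSES)

    def forward(self, tweet_ids):
        h, _ = self.encoder(tweet_ids)
        return self.head(h[-1])  # classify from the final encoder state

# Toy usage with random token ids (batch of 4, length 12).
tweets = torch.randint(1, VOCAB_SIZE, (4, 12))
replies = torch.randint(1, VOCAB_SIZE, (4, 12))
loss = pretrain_step(tweets, replies[:, :-1], replies[:, 1:], optimizer)
clf = SentimentClassifier(encoder)
logits = clf(tweets)  # shape: (4, NUM_CLASSES)
```

After pretraining, the decoder is discarded; only the encoder's weights carry over, which is how the classifier benefits from the unlabeled tweet-reply pairs.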

Citation (APA)

Shimizu, T., Kobayashi, H., & Shimizu, N. (2018). Pretraining sentiment classifiers with unlabeled dialog data. In ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Volume 2: Short Papers) (pp. 764–770). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p18-2121
