Abstract
Recent advances such as GPT and BERT have shown that pre-training a Transformer language model and then fine-tuning it can substantially improve downstream NLP systems. However, this framework still struggles to effectively incorporate supervised knowledge from other related tasks. In this study, we investigate a transferable BERT (TransBERT) training framework that transfers to a target task not only general language knowledge from large-scale unlabeled data but also specific kinds of knowledge from various semantically related supervised tasks. In particular, we propose three kinds of transfer tasks, namely natural language inference, sentiment classification, and next action prediction, to further train BERT on top of the pre-trained model, giving it a better initialization for the target task. We take story ending prediction as the target task for our experiments. The final result, an accuracy of 91.8%, dramatically outperforms previous state-of-the-art baselines. Several comparative experiments also offer practical suggestions on how to select transfer tasks to improve BERT.
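As a rough illustration of the two-stage transfer described above (not code from the paper), the sketch below uses Hugging Face Transformers: BERT is first further trained on a supervised transfer task (here, 3-way natural language inference), and the resulting encoder is then reused as the initialization for the target task, story ending prediction, framed as a two-way multiple-choice problem. The library choice, the toy sentences, the checkpoint name "transbert-nli", and the single gradient steps are all illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of the TransBERT-style sequential transfer, assuming
# Hugging Face Transformers / PyTorch; toy data stands in for the real
# NLI and Story Cloze Test corpora.
import torch
from transformers import (BertTokenizer, BertForSequenceClassification,
                          BertForMultipleChoice)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Stage 1: start from pre-trained BERT and further train it on a
# semantically related supervised task, e.g. 3-way natural language inference.
nli_model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)
premise, hypothesis = "A man is playing a guitar.", "A person is making music."
batch = tokenizer(premise, hypothesis, return_tensors="pt")
labels = torch.tensor([0])  # toy label: 0 = entailment
optimizer = torch.optim.AdamW(nli_model.parameters(), lr=2e-5)
loss = nli_model(**batch, labels=labels).loss  # one illustrative update step
loss.backward()
optimizer.step()
optimizer.zero_grad()

# Save the transferred encoder; its weights become the initialization
# for the target task (the NLI classification head is discarded later).
nli_model.save_pretrained("transbert-nli")

# Stage 2: fine-tune on the target task, story ending prediction, framed as
# choosing the plausible ending for a story context.
target_model = BertForMultipleChoice.from_pretrained(
    "transbert-nli", ignore_mismatched_sizes=True)  # fresh choice-scoring head
context = "Jim lost his keys on the way home."
endings = ["He called a locksmith to open the door.", "He flew to the moon."]
choices = tokenizer([context, context], endings, return_tensors="pt", padding=True)
choices = {k: v.unsqueeze(0) for k, v in choices.items()}  # (batch=1, choices=2, seq_len)
label = torch.tensor([0])  # index of the plausible ending
loss = target_model(**choices, labels=label).loss
loss.backward()  # ...continue ordinary fine-tuning on the real target-task data
```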
Citation
Li, Z., Ding, X., & Liu, T. (2019). Story ending prediction by transferable BERT. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2019-August, pp. 1800–1806). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2019/249