Pre-training a language model on a large corpus and then fine-tuning it on task-specific data has become a common approach. In practice, we observe that fine-tuning a pre-trained model on a small dataset can lead to over- and/or under-estimation of the data distribution. In this paper, we propose MC-Tailor, a novel method that alleviates this issue in text generation tasks by truncating probability mass in over-estimated regions and transferring it to under-estimated ones. Experiments on a variety of text generation datasets show that MC-Tailor consistently and significantly outperforms the standard fine-tuning approach. Our code is available at https://github.com/NingMiao/MC-tailor.
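The core idea of moving probability mass away from over-estimated regions can be illustrated with a simple rejection-sampling sketch. This is not the paper's implementation; `sample_fn`, `ratio_fn`, and `m` are hypothetical placeholders for a fine-tuned model's sampler, an estimated data-to-model density ratio, and an upper bound on that ratio.

```python
import random

def tailored_sample(sample_fn, ratio_fn, m, rng=random.random):
    """Draw one sample from a tailored distribution via rejection sampling.

    sample_fn: draws x from the fine-tuned model p(x) (assumed).
    ratio_fn:  estimates q(x)/p(x), the data-to-model density ratio, so
               over-estimated regions (ratio < 1) are accepted less often
               (assumed to be available, e.g. from a learned discriminator).
    m:         upper bound on ratio_fn, so the acceptance probability
               ratio_fn(x) / m stays in [0, 1].
    rng:       callable returning a uniform sample in [0, 1).
    """
    while True:
        x = sample_fn()
        # Accept x with probability proportional to the density ratio;
        # rejected mass is effectively transferred to under-estimated regions.
        if rng() < ratio_fn(x) / m:
            return x
```

Under these assumptions, the accepted samples follow a distribution proportional to `p(x) * ratio_fn(x)`, i.e. the model distribution reweighted toward the data distribution.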
Miao, N., Song, Y., Zhou, H., & Li, L. (2020). Do you have the right scissors? Tailoring pre-trained language models via Monte-Carlo methods. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 3436–3441). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.acl-main.314