KILM: Knowledge Injection into Encoder-Decoder Language Models

Abstract

Large pre-trained language models (PLMs) have been shown to retain implicit knowledge within their parameters. To enhance this implicit knowledge, we propose Knowledge Injection into Language Models (KILM), a novel approach that injects entity-related knowledge into encoder-decoder PLMs through continued pre-training with a generative knowledge infilling objective. This is done without modifying the PLM architecture or adding parameters. Experimental results over a suite of knowledge-intensive tasks spanning numerous datasets show that KILM enables models to retain more knowledge and hallucinate less while preserving their original performance on general NLU and NLG tasks. KILM also demonstrates improved zero-shot performance on tasks such as entity disambiguation, outperforming state-of-the-art models with 30x more parameters.

Cite

Xu, Y., Namazifar, M., Hazarika, D., Padmakumar, A., Liu, Y., & Hakkani-Tür, D. (2023). KILM: Knowledge Injection into Encoder-Decoder Language Models. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 5013–5035). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.acl-long.275
