Language Model Pre-Training with Sparse Latent Typing

Abstract

Modern large-scale Pre-trained Language Models (PLMs) have achieved tremendous success on a wide range of downstream tasks. However, most LM pre-training objectives focus solely on text reconstruction and do not seek to learn interpretable, latent-level representations of sentences. In this paper, we push language models toward a deeper understanding of sentences by proposing a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. Experimental results show that our model learns interpretable latent type categories in a self-supervised manner without using any external knowledge. Moreover, a language model pre-trained with this objective also significantly improves Information Extraction-related downstream tasks in both supervised and few-shot settings. Our code is publicly available at https://github.com/renll/SparseLT.
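The abstract describes the objective only at a high level. As a rough, hypothetical sketch of how such an objective could be attached to a transformer encoder (this is not the authors' implementation; their code is at the GitHub link above), the PyTorch module below scores each token for keyword selection, assigns it a discrete latent type via Gumbel-softmax, and adds a simple sparsity penalty to the pre-training loss. The class name, layer sizes, and the mean keep-probability penalty are all illustrative assumptions.

```python
# Hypothetical sketch of a sparse latent typing head.
# Not the authors' code; see https://github.com/renll/SparseLT for the real implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseLatentTypingHead(nn.Module):
    """Scores each token as a keyword and assigns it a discrete latent type.

    All sizes, the Gumbel-softmax assignment, and the sparsity penalty
    below are illustrative assumptions, not the published method.
    """

    def __init__(self, hidden_size: int = 768, num_latent_types: int = 16):
        super().__init__()
        self.keyword_scorer = nn.Linear(hidden_size, 1)               # keep/drop score per token
        self.type_scorer = nn.Linear(hidden_size, num_latent_types)   # latent type logits

    def forward(self, token_states: torch.Tensor, temperature: float = 1.0):
        # token_states: (batch, seq_len, hidden_size) encoder outputs
        keep_prob = torch.sigmoid(self.keyword_scorer(token_states))        # (B, L, 1)
        # Differentiable discrete type assignment via Gumbel-softmax.
        type_probs = F.gumbel_softmax(
            self.type_scorer(token_states), tau=temperature, hard=False     # (B, L, T)
        )
        # Sparsity penalty: push most tokens to NOT be selected as keywords,
        # so only a few tokens receive a latent type.
        sparsity_loss = keep_prob.mean()
        return keep_prob, type_probs, sparsity_loss


if __name__ == "__main__":
    head = SparseLatentTypingHead()
    hidden = torch.randn(2, 12, 768)   # stand-in for encoder outputs
    keep, types, sparsity = head(hidden)
    print(keep.shape, types.shape, sparsity.item())
```

In such a setup, the sparsity term would be added to the usual reconstruction loss with a weighting coefficient, so the model only pays the typing cost for the few tokens it chooses to keep.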

Citation (APA)
Ren, L., Zhang, Z., Wang, H., Voss, C. R., Zhai, C., & Ji, H. (2022). Language Model Pre-Training with Sparse Latent Typing. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 (pp. 1480–1494). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.emnlp-main.96
