Language Model Pre-Training with Sparse Latent Typing

Abstract

Modern large-scale Pre-trained Language Models (PLMs) have achieved tremendous success on a wide range of downstream tasks. However, most LM pre-training objectives focus solely on text reconstruction and do not seek to learn interpretable, latent-level representations of sentences. In this paper, we push language models toward a deeper understanding of sentences by proposing a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. Experimental results show that our model learns interpretable latent type categories in a self-supervised manner without using any external knowledge. Moreover, a language model pre-trained with this objective also significantly improves Information Extraction-related downstream tasks in both supervised and few-shot settings. Our code is publicly available at https://github.com/renll/SparseLT.
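The abstract describes the objective only at a high level. As a rough, hypothetical sketch of how such an objective could be attached to a transformer encoder (this is not the authors' implementation; their code is at the GitHub link above), the PyTorch module below scores each token for keyword selection, assigns it a discrete latent type via Gumbel-softmax, and adds a simple sparsity penalty to the pre-training loss. The class name, layer sizes, and the mean keep-probability penalty are all illustrative assumptions.

```python
# Hypothetical sketch of a sparse latent typing head.
# Not the authors' code; see https://github.com/renll/SparseLT for the real implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseLatentTypingHead(nn.Module):
    """Scores each token as a keyword and assigns it a discrete latent type.

    All sizes, the Gumbel-softmax assignment, and the sparsity penalty
    below are illustrative assumptions, not the published method.
    """

    def __init__(self, hidden_size: int = 768, num_latent_types: int = 16):
        super().__init__()
        self.keyword_scorer = nn.Linear(hidden_size, 1)               # keep/drop score per token
        self.type_scorer = nn.Linear(hidden_size, num_latent_types)   # latent type logits

    def forward(self, token_states: torch.Tensor, temperature: float = 1.0):
        # token_states: (batch, seq_len, hidden_size) encoder outputs
        keep_prob = torch.sigmoid(self.keyword_scorer(token_states))        # (B, L, 1)
        # Differentiable discrete type assignment via Gumbel-softmax.
        type_probs = F.gumbel_softmax(
            self.type_scorer(token_states), tau=temperature, hard=False     # (B, L, T)
        )
        # Sparsity penalty: push most tokens to NOT be selected as keywords,
        # so only a few tokens receive a latent type.
        sparsity_loss = keep_prob.mean()
        return keep_prob, type_probs, sparsity_loss


if __name__ == "__main__":
    head = SparseLatentTypingHead()
    hidden = torch.randn(2, 12, 768)   # stand-in for encoder outputs
    keep, types, sparsity = head(hidden)
    print(keep.shape, types.shape, sparsity.item())
```

In such a setup, the sparsity term would be added to the usual reconstruction loss with a weighting coefficient, so the model only pays the typing cost for the few tokens it chooses to keep.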

Citation (APA)
Ren, L., Zhang, Z., Wang, H., Voss, C. R., Zhai, C., & Ji, H. (2022). Language Model Pre-Training with Sparse Latent Typing. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 (pp. 1480–1494). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.emnlp-main.96
