Glyph Enhanced Chinese Character Pre-Training for Lexical Sememe Prediction

Boer Lyu; Lu Chen; Kai Yu

Conference Proceedings, Open Access

Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021 (2021) 4549-4555

DOI: 10.18653/v1/2021.findings-emnlp.386

Abstract

Sememes are defined as the atomic units used to describe the semantic meaning of concepts. Because manually annotating sememes is difficult and annotations are inconsistent between experts, the lexical sememe prediction task has been proposed. However, previous methods rely heavily on word or character embeddings and ignore fine-grained information. In this paper, we propose a novel pre-training method designed to better incorporate the internal information of Chinese characters. The Glyph enhanced Chinese Character representation (GCC) is used to assist sememe prediction. We evaluate our model on HowNet, a well-known sememe knowledge base. The experimental results show that our method outperforms existing models that do not use external information.

Cite

Lyu, B., Chen, L., & Yu, K. (2021). Glyph Enhanced Chinese Character Pre-Training for Lexical Sememe Prediction. In Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021 (pp. 4549–4555). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-emnlp.386
