Gazetteers are widely used in Chinese named entity recognition (NER) to improve span boundary detection and type classification. However, the NLP community still lacks a systematic analysis of gazetteer-enhanced NER models, which limits our understanding of how well gazetteers generalize and how effective they are. In this paper, we first re-examine several common practices in gazetteer-enhanced NER models, then carry out a series of detailed analyses of the relationship between model performance and gazetteer characteristics, which can guide the construction of more suitable gazetteers. Our findings are as follows: (1) the gazetteer has a positive impact on the NER model in most situations; (2) the NER model benefits greatly from high-quality pre-trained lexeme embeddings; (3) a good gazetteer should cover more entities that can be matched in both the training set and the test set.
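Finding (3) can be made concrete with a small coverage check. The following is only an illustrative sketch, not the authors' evaluation code: the data format (a sentence paired with its gold entity strings), the exact-match criterion, and the function names `entity_coverage` and `shared_matched_entries` are all assumptions introduced here.

```python
from typing import List, Set, Tuple

# Assumed format: (sentence, list of gold entity surface strings).
Example = Tuple[str, List[str]]


def entity_coverage(examples: List[Example], gazetteer: Set[str]) -> float:
    """Fraction of gold entity mentions whose surface form is in the gazetteer."""
    mentions = [entity for _, entities in examples for entity in entities]
    if not mentions:
        return 0.0
    hits = sum(1 for entity in mentions if entity in gazetteer)
    return hits / len(mentions)


def shared_matched_entries(
    train: List[Example], test: List[Example], gazetteer: Set[str]
) -> Set[str]:
    """Gazetteer entries that match gold entities in both the train and test sets."""
    train_hits = {e for _, entities in train for e in entities if e in gazetteer}
    test_hits = {e for _, entities in test for e in entities if e in gazetteer}
    return train_hits & test_hits
```

Under these assumptions, a gazetteer with high `entity_coverage` on both splits and a large `shared_matched_entries` set would be the kind of gazetteer that finding (3) describes as preferable.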
CITATION STYLE
Chen, Q., Zeng, X., Zhu, J., Zhang, Y., Lin, B., Yang, Y., & Jiang, D. (2022). Rethinking the Value of Gazetteer in Chinese Named Entity Recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13551 LNAI, pp. 285–297). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-17120-8_23