Using Chinese glyphs for named entity recognition (student abstract)

Abstract

Most Named Entity Recognition (NER) systems use additional features such as part-of-speech (POS) tags, shallow parsing, and gazetteers. Adding these external features to NER systems has been shown to have a positive impact. However, creating gazetteers or taggers can take a lot of time and may require extensive data cleaning. In this work, instead of using these traditional features, we use lexicographic features of Chinese characters. Chinese characters are composed of graphical components called radicals, and these components often carry semantic indicators. We propose CNN-based models that incorporate this semantic information and use it for NER. Our models show an improvement over the baseline BERT-BiLSTM-CRF model. We present one of the first studies on Chinese OntoNotes v5.0 and show an improvement of +0.64 F1 over the baseline. We present a state-of-the-art (SOTA) F1 score of 71.81 on the Weibo dataset, a competitive improvement of +0.72 F1 over the baseline on the ResumeNER dataset, and a SOTA F1 score of 96.49 on the MSRA dataset.
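
The abstract reports F1 scores without defining the metric. For NER, F1 is conventionally computed at the entity level: a predicted entity counts as correct only if both its span and its type match the gold annotation exactly. The following is a minimal sketch of that convention, assuming BIO-tagged sequences; the function names `bio_to_spans` and `entity_f1` and the toy tags are illustrative and are not taken from the paper, whose exact evaluation script is not stated here.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence.

    Assumption: an I- tag that does not continue an open entity of the
    same type is ignored rather than treated as the start of a new entity.
    """
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if continues:
            continue
        if etype is not None:
            spans.add((start, i, etype))
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        else:
            start, etype = None, None
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Entity-level F1: harmonic mean of span precision and recall."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_spans)
    recall = true_positives / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Toy example: the LOC entity is predicted one character too short,
# so only the PER entity counts as an exact match.
gold_tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]
pred_tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
score = entity_f1(gold_tags, pred_tags)
```

Under this convention a partially correct span earns no credit, so in the toy example precision and recall are both 1/2.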

Cite

Song, C. H., & Sehanobish, A. (2020). Using Chinese glyphs for named entity recognition (student abstract). In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence (pp. 13921–13922). AAAI Press. https://doi.org/10.1609/aaai.v34i10.7233
