Arabic Named Entity Recognition Using Clustered Word Embedding

Caroline Sabty; Mohamed Elmahdy; Slim Abdennadher

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 13396 LNCS, pp. 41–49 (2023)

DOI: 10.1007/978-3-031-23793-5_4


Abstract

Named Entity Recognition (NER) in Arabic is challenging because of the morphological and lexical richness of the language. In this paper, we propose an Arabic NER system based on word embeddings, which hold semantic information about the contexts in which words occur. We hypothesized that integrating word embedding features with the conventional lexical and contextual features could improve Arabic NER performance. A Conditional Random Field (CRF) sequence classifier was used. Since most CRF implementations only support categorical features, the continuous word embedding vectors are clustered. We mainly investigate the effect of the number of clusters on NER performance. Moreover, combining fine-grained and coarse-grained clusters resulted in a further improvement in recognition.
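The idea of turning continuous embedding vectors into categorical CRF features can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the abstract does not name the clustering algorithm, so plain k-means is used as a stand-in; the two-dimensional vectors, the transliterated words, the cluster counts (2 and 4), and the feature names `emb_coarse` and `emb_fine` are all hypothetical.

```python
import random

def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per input vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        # Update step: move each non-empty centroid to the mean of its members.
        for c in range(k):
            members = [vectors[i] for i in range(len(vectors)) if assign[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

# Hypothetical toy embeddings (real ones would have hundreds of dimensions).
embeddings = {
    "alqahira": [0.9, 0.1], "dimashq": [0.8, 0.2], "bayrut": [0.85, 0.15],
    "muhammad": [0.1, 0.9], "ahmad": [0.2, 0.8], "sara": [0.15, 0.95],
}
words = list(embeddings)
vectors = [embeddings[w] for w in words]

# Cluster the same vectors at two granularities (assumed sizes: 2 and 4).
coarse_ids = kmeans(vectors, k=2)
fine_ids = kmeans(vectors, k=4)
coarse = dict(zip(words, coarse_ids))
fine = dict(zip(words, fine_ids))

def token_features(sentence, i, coarse, fine):
    """Categorical features for token i: lexical context plus cluster IDs."""
    word = sentence[i]
    return {
        "word": word,
        "prev": sentence[i - 1] if i > 0 else "<BOS>",
        "next": sentence[i + 1] if i < len(sentence) - 1 else "<EOS>",
        # Cluster IDs are emitted as strings so the CRF treats them as categories.
        "emb_coarse": f"C{coarse[word]}" if word in coarse else "C_UNK",
        "emb_fine": f"F{fine[word]}" if word in fine else "F_UNK",
    }

sentence = ["ahmad", "fi", "alqahira"]
features = [token_features(sentence, i, coarse, fine) for i in range(len(sentence))]
```

Each token ends up with a dictionary of string-valued features, which is the kind of input CRF toolkits that accept only categorical features can consume. Emitting both a coarse and a fine cluster ID per word mirrors the combination of granularities mentioned in the abstract.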

Cite (APA)

Sabty, C., Elmahdy, M., & Abdennadher, S. (2023). Arabic Named Entity Recognition Using Clustered Word Embedding. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13396 LNCS, pp. 41–49). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-23793-5_4
