The impact of morphological stemming on Arabic mention detection and coreference resolution

Abstract

As a highly inflected and agglutinative language, Arabic presents an interesting challenge to natural language processing. In particular, this paper presents an in-depth investigation of the entity detection and recognition (EDR) task for Arabic. We start by highlighting why segmentation is a necessary prerequisite for EDR, continue by presenting a finite-state statistical segmenter, and then examine how the resulting segments can be better incorporated into a mention detection system and an entity recognition system; both systems are statistical and built around the maximum entropy principle. Experiments on a clearly stated partition of the ACE 2004 data show that stem-based features can significantly improve the performance of the EDT system by 2 absolute F-measure points. The system presented here performed competitively in the ACE 2004 evaluation.

Cite

Zitouni, I., Sorensen, J., Luo, X., & Florian, R. (2005). The impact of morphological stemming on Arabic mention detection and coreference resolution. In Computational Approaches to Semitic Languages Workshop Proceedings, SEMITIC@ACL 2005 (pp. 63–70). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1621787.1621800