Objective: This paper aimed to review corpus linguistics sources related to part-of-speech tagging and to build a sufficient annotated corpus for the Arabic language that contains Arabic words and their grammatical tags. Methods/Statistical Analysis: An in-depth survey conducted by the authors showed that there is a need for a free tagged Arabic corpus that can be used in natural language processing research. A corpus of 25,000 words was collected manually from different web sources written in Modern Standard Arabic. The collected words were tagged with reference to Arabic grammar books. Findings: The developed corpus can help researchers working on natural language processing applications. Applications/Improvements: The corpus needs to be expanded to include more words and their grammatical tags.
CITATION STYLE
Abumalloh, R. A., Al-Sarhan, H. M., & Abu-Ulbeh, W. (2016). Building Arabic corpus applied to part-of-speech tagging. Indian Journal of Science and Technology, 9(46). https://doi.org/10.17485/ijst/2016/v9i46/107110