Tagging Urdu Sentences from English POS Taggers

  • Naseem A
  • Anwar M
  • Ahmed S
  • et al.
N/ACitations
Citations of this article
9Readers
Mendeley users who have this article in their library.

Abstract

Being a global language, English has attracted a majority of researchers and academia to work on several Natural Language Processing (NLP) applications. The rest of the languages are not focused as much as English. Part-of-speech (POS) Tagging is a necessary component for several NLP applications. An accurate POS Tagger for a particular language is not easy to construct due to the diversity of that language. The global language English, POS Taggers are more focused and widely used by the researchers and academia for NLP processing. In this paper, an idea of reusing English POS Taggers for tagging non-English sentences is proposed. On exemplary basis, Urdu sentences are processed to tagged from 11 famous English POS Taggers. State-of-the-art English POS Taggers were explored from the literature, however, 11 famous POS Taggers were being input to Urdu sentences for tagging. A famous Google translator is used to translate the sentences across the languages. Data from twitter.com is extracted for evaluation perspective. Confusion matrix with kappa statistic is used to measure the accuracy of actual Vs predicted tagging. The two best English POS Taggers which tagged Urdu sentences were Stanford POS Tagger and MBSP POS Tagger with an accuracy of 96.4% and 95.7%, respectively. The system can be generalized for multilingual sentence tagging.

Cite

CITATION STYLE

APA

Naseem, A., Anwar, M., Ahmed, S., & Akhtar, Q. (2017). Tagging Urdu Sentences from English POS Taggers. International Journal of Advanced Computer Science and Applications, 8(10). https://doi.org/10.14569/ijacsa.2017.081030

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free