Statistical method for named entity recognition in Telugu, an Indian Language

Suneetha Eluri; Sumalatha Lingamgunta

Journal Article, Open Access

International Journal of Recent Technology and Engineering (2019) 8(2) 4211-4216

DOI: 10.35940/ijrte.B3500.078219

Abstract

Named Entity Recognition (NER) is one of the important tasks in Natural Language Processing (NLP). Its primary operation is to identify proper nouns, i.e., to locate all the named entities in a text and tag them with named entity categories such as Entity, Time expression, and Numeric expression. Previous work on NER for Telugu used Conditional Random Fields (CRF) and Maximum Entropy models; however, these approaches failed to handle ambiguous named entity tags for the same named entity. This paper presents a hybrid statistical system for NER in Telugu in which named entities are identified by both a dictionary-based approach and a statistical Hidden Markov Model (HMM). The proposed method uses a lexicon-lookup dictionary and contexts based on semantic features to predict named entity tags. An HMM is then used to resolve ambiguities in the predicted tags. The present work reports an average accuracy of 86.3% in finding named entities.
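
The two-stage idea described in the abstract (lexicon lookup to propose candidate tags, then an HMM to resolve ambiguous ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the tag set (`PER`, `LOC`, `O`), the transliterated tokens, the toy lexicon, and every transition and emission score below are assumptions chosen only for illustration, and the semantic context features mentioned in the abstract are not modelled. The sketch runs Viterbi decoding restricted to the tags the lexicon allows for each token.

```python
import math

# Hypothetical toy lexicon: token -> candidate tags (transliterated, illustrative only).
LEXICON = {
    "rama": {"PER"},
    "krishna": {"PER", "LOC"},  # ambiguous: a person name or part of a place name
    "nadi": {"LOC"},            # assumed here to be part of a location name
}

# Assumed tag-transition probabilities; "<s>" marks the sentence start.
TRANS = {
    "<s>": {"PER": 0.3, "LOC": 0.3, "O": 0.4},
    "PER": {"PER": 0.5, "LOC": 0.1, "O": 0.4},
    "LOC": {"PER": 0.1, "LOC": 0.5, "O": 0.4},
    "O":   {"PER": 0.3, "LOC": 0.3, "O": 0.4},
}

# Assumed, unnormalised emission scores for ambiguous tokens; others default to 1.0.
EMIT = {("krishna", "PER"): 0.6, ("krishna", "LOC"): 0.4}


def candidates(token):
    """Stage 1: dictionary lookup. Tokens missing from the lexicon get tag 'O'."""
    return sorted(LEXICON.get(token, {"O"}))


def tag_sentence(tokens):
    """Stage 2: Viterbi decoding over the lexicon candidates only."""
    if not tokens:
        return []
    best = []  # best[i][tag] = (log score of best path ending in tag, previous tag)
    for i, tok in enumerate(tokens):
        column = {}
        for tag in candidates(tok):
            emit = math.log(EMIT.get((tok, tag), 1.0))
            if i == 0:
                column[tag] = (math.log(TRANS["<s>"][tag]) + emit, None)
            else:
                prev_tag, score = max(
                    ((p, s + math.log(TRANS[p][tag])) for p, (s, _) in best[i - 1].items()),
                    key=lambda pair: pair[1],
                )
                column[tag] = (score + emit, prev_tag)
        best.append(column)
    # Backtrace from the highest-scoring final tag.
    tag = max(best[-1], key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(tokens) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]
```

With these toy scores, `tag_sentence(["rama", "krishna"])` tags the ambiguous token as `PER`, while `tag_sentence(["krishna", "nadi"])` tags it as `LOC`, because the neighbouring tag changes which path scores highest. In a real system the transition and emission probabilities would be estimated from an annotated corpus rather than set by hand.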

Cite

Eluri, S., & Lingamgunta, S. (2019). Statistical method for named entity recognition in Telugu, an Indian Language. International Journal of Recent Technology and Engineering, 8(2), 4211–4216. https://doi.org/10.35940/ijrte.B3500.078219
