Named Entity Recognition (NER) is one of the important tasks of Natural Language Processing (NLP). Its primary operation is to identify proper nouns, that is, to locate all the named entities in a text and tag each with a named entity category such as entity, time expression, or numeric expression. Previous work addressed NER for the Telugu language with Conditional Random Fields (CRF) and Maximum Entropy models; however, these models failed to handle cases in which the same named entity can receive more than one tag. This paper presents a hybrid statistical system for NER in Telugu in which named entities are identified by both a dictionary-based approach and a statistical Hidden Markov Model (HMM). The proposed method uses a lexicon-lookup dictionary and contexts based on semantic features to predict named entity tags. An HMM is then used to resolve the ambiguities in the predicted tags. The present work reports an average accuracy of 86.3% in identifying named entities.
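The two-stage idea described above (lexicon lookup proposes candidate tags, then an HMM resolves ambiguous ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon, the tag set, the transition probabilities, and the uniform emission model are all hypothetical toy values, and English tokens stand in for Telugu text purely for readability.

```python
import math

# Hypothetical toy lexicon: token -> candidate tags (illustrative, not from the paper).
LEXICON = {
    "washington": ["PERSON", "LOCATION"],  # ambiguous entry
    "visited": ["O"],
    "in": ["O"],
    "delhi": ["LOCATION"],
}

TAGS = ["PERSON", "LOCATION", "O"]

# Hypothetical transition probabilities P(tag | previous tag); "<s>" marks the start.
TRANS = {
    "<s>": {"PERSON": 0.4, "LOCATION": 0.2, "O": 0.4},
    "PERSON": {"PERSON": 0.2, "LOCATION": 0.1, "O": 0.7},
    "LOCATION": {"PERSON": 0.1, "LOCATION": 0.2, "O": 0.7},
    "O": {"PERSON": 0.2, "LOCATION": 0.3, "O": 0.5},
}


def candidates(token):
    """Lexicon lookup; tokens missing from the lexicon may take any tag."""
    return LEXICON.get(token.lower(), TAGS)


def viterbi(tokens):
    """Return the most probable tag sequence under the toy HMM."""
    # best maps a tag to (log probability, best path ending in that tag).
    best = {"<s>": (0.0, [])}
    for token in tokens:
        cands = candidates(token)
        # Assumption: emission probability is uniform over the candidate tags.
        emit = math.log(1.0 / len(cands))
        step = {}
        for tag in cands:
            score, path = max(
                (
                    (prev_score + math.log(TRANS[prev][tag]) + emit, prev_path)
                    for prev, (prev_score, prev_path) in best.items()
                ),
                key=lambda item: item[0],
            )
            step[tag] = (score, path + [tag])
        best = step
    return max(best.values(), key=lambda item: item[0])[1]


print(viterbi(["Washington", "visited", "Delhi"]))
print(viterbi(["in", "Washington"]))
```

In this sketch the ambiguous token is tagged differently depending on its neighbours, which is the kind of disambiguation the abstract attributes to the HMM stage. A real system would estimate transition and emission probabilities from an annotated corpus rather than hard-coding them.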
Eluri, S., & Lingamgunta, S. (2019). Statistical method for named entity recognition in Telugu, an Indian Language. International Journal of Recent Technology and Engineering, 8(2), 4211–4216. https://doi.org/10.35940/ijrte.B3500.078219