Named-Entity-Recognition (NER) for Tamil language using Margin-Infused Relaxed Algorithm (MIRA)

Pranavan Theivendiram; Megala Uthayakumar; Nilusija Nadarasamoorthy; Mokanarangan Thayaparan; Sanath Jayasena; Gihan Dias; Surangika Ranathunga

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 9623 LNCS 465-476

DOI: 10.1007/978-3-319-75477-2_33

Abstract

Named-Entity-Recognition (NER) is widely used as a foundation for Natural Language Processing (NLP) applications. There have been few previous attempts at building generic NER systems for the Tamil language. These attempts were based on machine-learning approaches such as Hidden Markov Models (HMM), Maximum Entropy Markov Models (MEMM), Support Vector Machines (SVM) and Conditional Random Fields (CRF). Among them, CRF has proven to be the most accurate for NER in Tamil. This paper presents a novel approach to building a Tamil NER system using the Margin-Infused Relaxed Algorithm (MIRA). We also present a performance comparison between the MIRA and CRF algorithms for Tamil NER. When gazetteer, POS tag and orthographic features are used with the MIRA algorithm, it attains an F1-measure of 81.38% on the Tamil BBC news data, whereas the CRF algorithm attains an F1-measure of only 79.13% with the same set of features. Our NER system outperforms all previous NER systems for the Tamil language.
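The abstract does not describe how MIRA was implemented, so the following is only a minimal sketch of the general idea behind a 1-best MIRA update, not the authors' system. It assumes a simplified per-token classifier with binary features and a 0/1 cost, whereas an NER tagger would normally score whole label sequences. The function names (`score`, `mira_update`), the feature strings and the label set are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def score(weights, feats, label):
    """Linear score of `label` given a set of active binary features."""
    return sum(weights[(f, label)] for f in feats)


def mira_update(weights, feats, gold, labels, C=1.0):
    """One 1-best MIRA step for a single token (illustrative sketch).

    Makes the smallest weight change, capped by C, such that the gold label
    outscores the wrongly predicted label by a margin of 1 (the 0/1 cost).
    Returns the label predicted *before* the update.
    """
    feats = set(feats)  # treat features as distinct binary indicators
    pred = max(labels, key=lambda y: score(weights, feats, y))
    if pred == gold or not feats:
        return pred  # nothing to correct, or no features to update

    # Margin violation: cost (1.0) plus the score gap in favour of the wrong label.
    loss = 1.0 + score(weights, feats, pred) - score(weights, feats, gold)

    # Squared norm of f(x, gold) - f(x, pred): each active feature contributes
    # +1 for the gold label and -1 for the predicted label.
    norm_sq = 2.0 * len(feats)

    tau = min(C, loss / norm_sq)  # step size, clipped by aggressiveness C
    for f in feats:
        weights[(f, gold)] += tau
        weights[(f, pred)] -= tau
    return pred


# Hypothetical example: one token with a gazetteer hit and a POS-tag feature.
weights = defaultdict(float)
labels = ["O", "PER"]
feats = {"in_gazetteer", "pos=NNP"}

first = mira_update(weights, feats, "PER", labels)   # wrong guess, weights move
second = mira_update(weights, feats, "PER", labels)  # now predicted correctly
```

In this toy run the first call mispredicts (all weights start at zero) and the update shifts the weights just far enough that the gold label leads by exactly the required margin, which is the "relaxed margin" behaviour that distinguishes MIRA from a plain perceptron step.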

Cite

Theivendiram, P., Uthayakumar, M., Nadarasamoorthy, N., Thayaparan, M., Jayasena, S., Dias, G., & Ranathunga, S. (2018). Named-Entity-Recognition (NER) for Tamil language using Margin-Infused Relaxed Algorithm (MIRA). In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9623 LNCS, pp. 465–476). Springer Verlag. https://doi.org/10.1007/978-3-319-75477-2_33
