CRF models for Tamil part of speech tagging and chunking

Abstract

Conditional random fields (CRFs) are a framework for building probabilistic models to segment and label sequence data. CRFs offer several advantages over hidden Markov models (HMMs) and stochastic grammars for such tasks, including the ability to relax the strong independence assumptions made in those models. CRFs also avoid a fundamental limitation of maximum entropy Markov models (MEMMs) and other discriminative Markov models based on directed graphical models, which can be biased towards states with few successor states. In this paper we propose CRF-based language models for part-of-speech (POS) tagging and chunking of Tamil. The language models are designed using morphological information. The CRF-based POS tagger achieves an accuracy of about 89.18% for Tamil, and the chunker achieves an accuracy of 84.25% for the same language. © 2009 Springer Berlin Heidelberg.
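
To make the contrast with MEMMs concrete, here is a minimal sketch of how a linear-chain CRF scores a whole tag sequence and normalizes over all candidate sequences at once, rather than normalizing per state. It is not the authors' model: the tag set, the feature templates (a two-character suffix as a crude stand-in for morphological information, plus a tag-transition feature), the placeholder tokens, and the weights are all assumptions chosen for illustration. The normalizer is computed by brute-force enumeration, which is only practical for toy inputs.

```python
import itertools
import math

# Hypothetical two-tag set, for illustration only (not the paper's tag set).
LABELS = ["NOUN", "VERB"]


def features(prev_label, label, tokens, i):
    """Return the names of the features active at position i.

    Both templates are assumptions: a two-character suffix paired with the
    current tag, and a transition between the previous and current tag.
    """
    word = tokens[i]
    return [
        f"suffix2={word[-2:]}|{label}",
        f"trans={prev_label}>{label}",
    ]


def score(tokens, labels, weights):
    """Unnormalized log-score: the summed weights of all active features."""
    total = 0.0
    prev = "<START>"
    for i, label in enumerate(labels):
        for name in features(prev, label, tokens, i):
            total += weights.get(name, 0.0)
        prev = label
    return total


def probability(tokens, labels, weights):
    """p(labels | tokens), normalized over every possible tag sequence."""
    z = sum(
        math.exp(score(tokens, list(candidate), weights))
        for candidate in itertools.product(LABELS, repeat=len(tokens))
    )
    return math.exp(score(tokens, labels, weights)) / z


# Placeholder tokens and hand-set weights (hypothetical, not learned).
tokens = ["w1ab", "w2cd"]
weights = {"suffix2=ab|NOUN": 1.0, "trans=NOUN>VERB": 0.5}
best = max(
    itertools.product(LABELS, repeat=len(tokens)),
    key=lambda candidate: probability(tokens, list(candidate), weights),
)
```

Because the normalizer sums over complete sequences, a state with few successors gains no automatic advantage, which is the label-bias issue the abstract attributes to MEMMs. A practical tagger would replace the enumeration with the forward algorithm and learn the weights from annotated data.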

Pandian, S. L., & Geetha, T. V. (2009). CRF models for Tamil part of speech tagging and chunking. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5459 LNAI, pp. 11–22). https://doi.org/10.1007/978-3-642-00831-3_2
