A semi-supervised associative classification method for POS tagging

This article is free to access.

Abstract

We present a data mining approach to part-of-speech (POS) tagging, an important natural language processing (NLP) task that can be framed as a classification problem. We propose a semi-supervised associative classification method for POS tagging. Existing methods for building POS taggers require extensive domain and linguistic knowledge and resources. Our method trains on a combination of a small POS-tagged corpus and untagged text, and builds the classifier model from association rules. The tagger works well even with very little training data, and the use of semi-supervised learning removes the need for a large, high-quality annotated corpus. These properties make it especially suitable for resource-poor languages. Our experiments on resource-rich, resource-moderate, and resource-poor languages show good performance without using any language-specific linguistic information. We note that including language-specific features in our method may further improve performance. Results also show that, for smaller training data sizes, our tagger performs better than a state-of-the-art conditional random field (CRF) tagger using the same features as our tagger.
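
To make the idea of associative classification concrete, here is a minimal sketch of how context-to-tag association rules might be mined from a small tagged corpus and then used to tag new text. It is not the authors' algorithm: the feature set (`context_features`), the thresholds (`min_support`, `min_confidence`), the function names, and the toy corpus are all assumptions made for illustration, and the semi-supervised step that draws on untagged text is not shown.

```python
from collections import Counter, defaultdict


def context_features(words, i):
    """Hypothetical feature set: the word, its neighbours, and a 2-char suffix."""
    prev_w = words[i - 1] if i > 0 else "<s>"
    next_w = words[i + 1] if i + 1 < len(words) else "</s>"
    return [
        ("word", words[i]),
        ("prev", prev_w),
        ("next", next_w),
        ("suffix2", words[i][-2:]),
    ]


def mine_rules(tagged_sentences, min_support=2, min_confidence=0.6):
    """Mine single-feature rules `feature -> tag`, kept only if they meet
    the (assumed) support and confidence thresholds."""
    feature_count = Counter()
    pair_count = Counter()
    for sentence in tagged_sentences:
        words = [w for w, _ in sentence]
        for i, (_, tag) in enumerate(sentence):
            for feat in context_features(words, i):
                feature_count[feat] += 1
                pair_count[(feat, tag)] += 1

    rules = defaultdict(list)
    for (feat, tag), support in pair_count.items():
        confidence = support / feature_count[feat]
        if support >= min_support and confidence >= min_confidence:
            rules[feat].append((confidence, support, tag))
    return rules


def tag_sentence(words, rules, default_tag):
    """Tag each word with the best matching rule (highest confidence, then
    highest support); fall back to `default_tag` when no rule fires."""
    tags = []
    for i in range(len(words)):
        candidates = [
            rule
            for feat in context_features(words, i)
            for rule in rules.get(feat, [])
        ]
        tags.append(max(candidates)[2] if candidates else default_tag)
    return tags


# Toy tagged corpus (invented for this sketch).
train = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
rules = mine_rules(train)
predicted = tag_sentence(["the", "bird", "sleeps"], rules, default_tag="NOUN")
print(predicted)
```

In this toy run the unseen word "bird" still receives a tag, because rules on its neighbouring words fire even though no rule mentions the word itself. That is the kind of context-driven behaviour a rule-based classifier can offer when tagged data is scarce.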

Cite

Rani, P., Pudi, V., & Sharma, D. M. (2016). A semi-supervised associative classification method for POS tagging. International Journal of Data Science and Analytics, 1(2), 123–136. https://doi.org/10.1007/s41060-016-0010-5
