Part-of-speech tagging using decision trees

Abstract

We apply inductive learning of statistical decision trees to the Natural Language Processing (NLP) task of morphosyntactic disambiguation (part-of-speech tagging). Previous work showed that the acquired language models are independent enough to be incorporated easily, as a statistical core of rules, into any flexible tagger. They are also complete enough to be used directly as sets of POS disambiguation rules. We have implemented a simple and fast tagger and evaluated it on the Wall Street Journal (WSJ) corpus, where it achieves remarkable accuracy. This paper mainly addresses tagging when only a small amount of training material is available, a situation that is crucial when constructing an annotated corpus from scratch. We show that our system achieves quite high accuracy in this situation. We also address the problem of handling unknown words under the same conditions of scarce training examples, and report comparative results and comments on closely related work.
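To make the idea of a statistical decision tree for POS disambiguation concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the toy examples, the feature names `prev_tag` and `next_tag`, the information-gain selection criterion, and the helper names `build_tree` and `classify` are all hypothetical, since the abstract does not specify them. The sketch grows one tree for a single ambiguity class (words that may be NN or VB) and stores a tag probability distribution at each leaf.

```python
from collections import Counter
from math import log2

# Hypothetical toy examples for one ambiguity class (NN vs. VB).
# Each item: (context features, correct tag). Feature names are assumptions.
EXAMPLES = [
    ({"prev_tag": "DT", "next_tag": "VBZ"}, "NN"),
    ({"prev_tag": "DT", "next_tag": "IN"}, "NN"),
    ({"prev_tag": "TO", "next_tag": "DT"}, "VB"),
    ({"prev_tag": "TO", "next_tag": "IN"}, "VB"),
    ({"prev_tag": "MD", "next_tag": "DT"}, "VB"),
    ({"prev_tag": "JJ", "next_tag": "IN"}, "NN"),
]

def tag_distribution(examples):
    """Relative frequency of each tag among the examples."""
    counts = Counter(tag for _, tag in examples)
    return {tag: c / len(examples) for tag, c in counts.items()}

def entropy(examples):
    """Shannon entropy (bits) of the tag distribution."""
    return -sum(p * log2(p) for p in tag_distribution(examples).values())

def partition(examples, attribute):
    """Group examples by the value they take for one attribute."""
    groups = {}
    for features, tag in examples:
        groups.setdefault(features[attribute], []).append((features, tag))
    return groups

def information_gain(examples, attribute):
    """Entropy reduction from splitting on the attribute (assumed criterion)."""
    groups = partition(examples, attribute).values()
    remainder = sum(len(g) / len(examples) * entropy(g) for g in groups)
    return entropy(examples) - remainder

def build_tree(examples, attributes):
    """Grow a tree whose leaves hold tag probability distributions."""
    distribution = tag_distribution(examples)
    if len(distribution) == 1 or not attributes:
        return {"leaf": distribution}
    best = max(attributes, key=lambda a: information_gain(examples, a))
    if information_gain(examples, best) <= 0:
        return {"leaf": distribution}
    remaining = [a for a in attributes if a != best]
    children = {value: build_tree(group, remaining)
                for value, group in partition(examples, best).items()}
    return {"attribute": best, "default": distribution, "children": children}

def classify(tree, features):
    """Return the tag distribution for a context; back off on unseen values."""
    while "leaf" not in tree:
        child = tree["children"].get(features.get(tree["attribute"]))
        if child is None:
            return tree["default"]
        tree = child
    return tree["leaf"]

TREE = build_tree(EXAMPLES, ["prev_tag", "next_tag"])
```

Calling `classify(TREE, {"prev_tag": "TO", "next_tag": "NN"})` returns the distribution stored at the matching leaf, and a context with an unseen attribute value falls back to the distribution at the last node reached. Keeping distributions rather than single tags at the leaves is what would let such trees feed a tagger that combines evidence from several sources.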

Cite

Marquez, L., & Rodriguez, H. (1998). Part-of-speech tagging using decision trees. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1398, pp. 25–36). Springer Verlag. https://doi.org/10.1007/bfb0026668
