Tagging English by path voting constraints

Gökhan Tür; Kemal Oflazer

Conference Proceedings

Proceedings of the Annual Meeting of the Association for Computational Linguistics (1998) 2 1277-1281

DOI: 10.3115/980691.980777

Abstract

We describe a constraint-based tagging approach where individual constraint rules vote on sequences of matching tokens and tags. Disambiguation of all tokens in a sentence is performed at the very end by selecting tags that appear on the path that receives the highest vote. This constraint application paradigm makes the outcome of the disambiguation independent of the rule sequence, and hence relieves the rule developer from worrying about potentially conflicting rule sequencing. The approach can also combine statistically and manually obtained constraints, and incorporate negative constraint rules to rule out certain patterns. We have applied this approach to tagging English text from the Wall Street Journal and the Brown Corpora. Our results from the Wall Street Journal Corpus indicate that with 400 statistically derived constraint rules and about 800 hand-crafted constraint rules, we can attain an average accuracy of 97.89% on the training corpus and an average accuracy of 97.50% on the testing corpus. We can also relax the single tag per token limitation and allow ambiguous tagging, which lets us trade off recall against precision.
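
The voting idea in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the toy sentence, the tag-sequence constraints, the vote values, and the names `path_vote` and `best_path` are all assumptions made for illustration, and the sketch enumerates every tag path by brute force, which would not scale to real sentences. It only shows that each constraint adds its vote to every path it matches, that a negative vote penalises a pattern, and that the winning path does not depend on the order in which constraints are listed.

```python
from itertools import product

# Hypothetical sentence: each token paired with its candidate tags.
lattice = [
    ("the", ["DT"]),
    ("can", ["MD", "NN", "VB"]),
    ("rusts", ["VBZ", "NNS"]),
]

# Hypothetical constraints: a tag-sequence pattern and its vote.
# A negative vote plays the role of a negative constraint.
constraints = [
    (("DT", "NN"), 10),
    (("NN", "VBZ"), 5),
    (("DT", "MD"), -10),
    (("DT", "VB"), -10),
]

def path_vote(tags, constraints):
    """Sum the votes of every constraint at every position where it matches."""
    total = 0
    for pattern, vote in constraints:
        n = len(pattern)
        for i in range(len(tags) - n + 1):
            if tuple(tags[i:i + n]) == pattern:
                total += vote
    return total

def best_path(lattice, constraints):
    """Return the tag sequence with the highest total vote (brute force)."""
    candidates = product(*(tags for _, tags in lattice))
    return max(candidates, key=lambda tags: path_vote(tags, constraints))

if __name__ == "__main__":
    print(best_path(lattice, constraints))
```

Because votes are simply summed per path, reversing or shuffling the `constraints` list leaves the selected path unchanged, which is the order-independence property the abstract describes.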

Cite

Tür, G., & Oflazer, K. (1998). Tagging English by path voting constraints. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 2, pp. 1277–1281). Association for Computational Linguistics (ACL). https://doi.org/10.3115/980691.980777
