Words unknown to the lexicon present a substantial problem for part-of-speech tagging. In this paper we present a technique for fully unsupervised statistical acquisition of rules that guess possible parts of speech for unknown words. Three complementary sets of word-guessing rules are induced from the lexicon and a raw corpus: prefix morphological rules, suffix morphological rules, and ending-guessing rules. Learning was performed on the Brown Corpus, and the resulting rule sets, which achieved highly competitive performance, were compared with the state of the art.
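To make the idea of an ending-guessing rule concrete, the following is a minimal sketch and not the procedure from the paper: it only counts which tag set most often accompanies each word ending in a toy lexicon, and omits the statistical scoring and the raw-corpus step that the abstract mentions. The function names, the `max_len` and `min_count` parameters, and the toy lexicon entries are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def induce_ending_rules(lexicon, max_len=3, min_count=2):
    """Map each word ending to the tag set most often seen with it.

    Illustrative simplification: plain frequency counts stand in for
    the statistical rule scoring used in the paper.
    """
    counts = defaultdict(Counter)
    for word, tags in lexicon.items():
        for n in range(1, max_len + 1):
            if len(word) > n:  # keep at least one character before the ending
                counts[word[-n:]][frozenset(tags)] += 1

    rules = {}
    for ending, tag_counts in counts.items():
        tagset, freq = tag_counts.most_common(1)[0]
        if freq >= min_count:  # drop endings with too little evidence
            rules[ending] = tagset
    return rules


def guess_tags(word, rules, max_len=3):
    """Return the tag set of the longest matching ending, or None."""
    for n in range(min(max_len, len(word) - 1), 0, -1):
        tagset = rules.get(word[-n:])
        if tagset is not None:
            return tagset
    return None


# Hypothetical toy lexicon with Brown-style tags (not data from the paper).
lexicon = {
    "walking": {"VBG"},
    "talking": {"VBG"},
    "jumped": {"VBD", "VBN"},
    "walked": {"VBD", "VBN"},
    "quickly": {"RB"},
    "slowly": {"RB"},
}

rules = induce_ending_rules(lexicon)
print(sorted(guess_tags("jogging", rules)))  # ['VBG']
print(sorted(guess_tags("parsed", rules)))   # ['VBD', 'VBN']
print(guess_tags("xyz", rules))              # None
```

Preferring the longest matching ending is a design choice of this sketch: longer endings such as "ing" are usually more specific than shorter ones such as "g", so they are tried first.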
Mikheev, A. (1996). Unsupervised learning of word-category guessing rules. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1996-June, pp. 327–334). Association for Computational Linguistics (ACL). https://doi.org/10.3115/981863.981906