Unsupervised methods for developing taxonomies by combining syntactic and statistical information

Abstract

This paper describes an unsupervised algorithm for placing unknown words into a taxonomy and evaluates its accuracy on a large and varied sample of words. The algorithm works by first using a large corpus to find semantic neighbors of the unknown word, which we accomplish by combining latent semantic analysis with part-of-speech information. We then place the unknown word in the part of the taxonomy where these neighbors are most concentrated, using a class-labelling algorithm developed especially for this task. This method is used to reconstruct parts of the existing WordNet database, obtaining results for common nouns, proper nouns and verbs. We evaluate the contribution made by part-of-speech tagging and show that automatic filtering using the class-labelling algorithm gives a fourfold improvement in accuracy.
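The two steps in the abstract (find semantic neighbors, then label the region of the taxonomy where they concentrate) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy taxonomy `PARENT`, the two-dimensional `VECTORS` standing in for latent semantic analysis output, the function names, and the scoring rule (a reward of 1/distance² for each neighbor a class subsumes, and a flat penalty for each one it does not). The sketch omits part-of-speech information entirely and is not presented as the paper's actual class-labelling algorithm.

```python
import math

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "horse": "animal",
    "car": "vehicle",
}

# Hypothetical 2-d vectors standing in for LSA word vectors.
VECTORS = {
    "dog": (0.9, 0.1),
    "cat": (0.8, 0.2),
    "horse": (0.7, 0.3),
    "car": (0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest_neighbors(vec, k):
    """Return the k known words most similar to vec, best first."""
    ranked = sorted(VECTORS, key=lambda w: cosine(vec, VECTORS[w]), reverse=True)
    return ranked[:k]


def ancestors(word):
    """Map each ancestor of word to its distance (parent = 1, grandparent = 2, ...)."""
    result, dist, node = {}, 1, PARENT[word]
    while node is not None:
        result[node] = dist
        node = PARENT[node]
        dist += 1
    return result


def best_class(neighbors, penalty=0.25):
    """Score every ancestor of any neighbor and return the best one with all scores.

    Assumed scoring: +1/d^2 for a neighbor at distance d below the class,
    -penalty for a neighbor the class does not subsume.
    """
    paths = {w: ancestors(w) for w in neighbors}
    candidates = {c for path in paths.values() for c in path}
    scores = {}
    for c in candidates:
        total = 0.0
        for w in neighbors:
            d = paths[w].get(c)
            total += 1.0 / d ** 2 if d is not None else -penalty
        scores[c] = total
    return max(scores, key=scores.get), scores


# Place an "unknown word" whose vector lies close to the animal words.
neighbors = nearest_neighbors((0.85, 0.15), k=4)
label, scores = best_class(neighbors)
```

With these toy inputs, the very general class `entity` subsumes every neighbor but earns little from each because of the distance term, while `animal` sits directly above three of the four neighbors and so scores highest despite the penalty for `car`. That trade-off between coverage and specificity is what "most concentrated" is taken to mean in this sketch.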

Cite

Widdows, D. (2003). Unsupervised methods for developing taxonomies by combining syntactic and statistical information. In Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, HLT-NAACL 2003. Association for Computational Linguistics (ACL). https://doi.org/10.3115/1073445.1073481
