CHOOSING A DISTANCE METRIC FOR AUTOMATIC WORD CATEGORIZATION

Abstract

This paper analyzes how different distance metrics behave in a bottom-up, unsupervised algorithm for automatic word categorization. The proposed method uses a modified greedy-type algorithm, and formulations from fuzzy theory are used to calculate each element's degree of membership in the linguistic clusters formed. The method draws on the unigram and bigram statistics of a corpus of about two million words. Empirical comparisons support the discussion of which type of distance metric is most suitable for measuring the similarity between linguistic elements.
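
As a rough illustration of what "distance between words based on bigram statistics" can mean, the sketch below builds a neighbour-count profile for each word and compares two words under two common metrics. It is not the paper's method: the abstract does not say which metrics are compared, so the choice of Manhattan and Euclidean distance, the toy corpus, and the names `bigram_profiles`, `manhattan`, and `euclidean` are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def bigram_profiles(tokens):
    """Map each word to counts of its left ("L") and right ("R") neighbours.

    Hypothetical helper: a simple stand-in for bigram statistics.
    """
    profiles = {}
    for left, right in zip(tokens, tokens[1:]):
        profiles.setdefault(left, Counter())[("R", right)] += 1
        profiles.setdefault(right, Counter())[("L", left)] += 1
    return profiles


def manhattan(p, q):
    """Sum of absolute count differences (assumed metric, not from the paper)."""
    keys = set(p) | set(q)
    return sum(abs(p[k] - q[k]) for k in keys)


def euclidean(p, q):
    """Square root of summed squared count differences (assumed metric)."""
    keys = set(p) | set(q)
    return sqrt(sum((p[k] - q[k]) ** 2 for k in keys))


# Toy corpus, invented for illustration only.
tokens = "the cat sat on the mat the dog sat on the rug".split()
profiles = bigram_profiles(tokens)

# "cat" and "dog" occur in identical contexts; "on" does not.
d_cat_dog = manhattan(profiles["cat"], profiles["dog"])
d_cat_on = manhattan(profiles["cat"], profiles["on"])
e_cat_on = euclidean(profiles["cat"], profiles["on"])
```

In this toy corpus "cat" and "dog" share the same neighbours, so any reasonable metric places them closer together than "cat" and "on". A bottom-up clustering procedure would merge the closest pairs first, which is why the choice of metric affects the categories that emerge.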

Cite

Korkmaz, E. E., & Ücoluk, G. (1998). CHOOSING A DISTANCE METRIC FOR AUTOMATIC WORD CATEGORIZATION. In Proceedings of the Joint Conference on New Methods in Language Processing and Computational Natural Language Learning, NeMLaP/CoNLL 1998 (pp. 111–120). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1603899.1603919
