Comparison of collocation extraction measures for document indexing

Abstract

Automatic extraction of collocations from a corpus is a well-known problem in natural language processing. It is typically carried out by employing a statistical measure that indicates whether two words occur together more often than by chance. Because an abundance of such measures has been proposed by various authors, we have compared some of them on the task of extracting collocations from a corpus of Croatian legal documents for the purpose of document indexing. We also propose and evaluate extensions of these measures for collocations consisting of three words.
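
To make the idea of "occurring together more often than by chance" concrete, the following is a minimal sketch that scores adjacent word pairs with pointwise mutual information (PMI). PMI is one commonly used association measure; the abstract does not name the measures compared, so its use here is an assumption, and the function name `pmi_scores`, the `min_count` parameter, and the toy English sentence are hypothetical illustrations rather than the authors' method or data.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=1):
    """Score each adjacent word pair by pointwise mutual information.

    PMI(x, y) = log2( P(x, y) / (P(x) * P(y)) ). A positive score means
    the pair co-occurs more often than independence would predict.
    Illustrative only: PMI is an assumed measure, not one confirmed
    by the abstract.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of adjacent pairs

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy input (hypothetical, not drawn from the Croatian legal corpus).
tokens = ("the supreme court ruled that the supreme court "
          "may review the law").split()
scores = pmi_scores(tokens)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1])[:3]:
    print(pair, round(score, 3))
```

On real corpora PMI tends to favor rare pairs, which is why a frequency threshold such as `min_count` is usually applied before ranking candidates.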

Cite

Petrovic, S., Snajder, J., Basic, B. D., & Kolar, M. (2006). Comparison of collocation extraction measures for document indexing. Journal of Computing and Information Technology, 14(4), 321–327. https://doi.org/10.2498/cit.2006.04.08