TermeX: A tool for collocation extraction


Abstract

Collocations, word combinations that occur together more often than chance would predict, have a wide range of NLP applications. Many approaches to automating collocation extraction based on lexical association measures have been proposed in the literature. This paper presents TermeX, a tool for efficient extraction of collocations based on a variety of association measures. TermeX implements POS filtering and lemmatization, and can extract collocations of up to four words. We address the trade-off between memory consumption and processing speed and propose an efficient implementation whose processing time is linear in corpus size and whose memory consumption is linear in the number of word types. © Springer-Verlag Berlin Heidelberg 2009.
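To make the idea of a lexical association measure concrete, the following is a minimal sketch that scores adjacent word pairs by pointwise mutual information (PMI), one commonly used measure. It is not TermeX's implementation: the function name `bigram_pmi` and the `min_count` threshold are assumptions made for illustration, and the sketch omits POS filtering, lemmatization, longer n-grams, and any attempt to reproduce the memory behavior described in the abstract.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Rank adjacent word pairs by PMI (illustrative sketch, not TermeX code).

    `min_count` is a hypothetical frequency cutoff that drops rare pairs,
    whose PMI estimates tend to be unreliable.
    """
    unigrams = Counter(tokens)                  # frequency of each word type
    bigrams = Counter(zip(tokens, tokens[1:]))  # frequency of each adjacent pair
    n_tokens = len(tokens)
    n_bigrams = max(n_tokens - 1, 1)            # guard against division by zero

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_tokens
        p_w2 = unigrams[w2] / n_tokens
        # PMI = log2( P(w1, w2) / (P(w1) * P(w2)) )
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))

    # Highest-scoring pairs first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    text = "new york is big and new york is old"
    for pair, score in bigram_pmi(text.split()):
        print(pair, round(score, 2))
```

A positive score means the pair occurs more often than its words' individual frequencies would predict under independence, which matches the informal definition of a collocation given above.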


Delač, D., Krleža, Z., Šnajder, J., Bašić, B. D., & Šarić, F. (2009). TermeX: A tool for collocation extraction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5449 LNCS, pp. 149–157). https://doi.org/10.1007/978-3-642-00382-0_12
