Building clusters of related words: An unsupervised approach

Abstract

The task of finding semantically related words from a text corpus has applications in lexicon induction, word sense disambiguation and information retrieval, to name a few. Real-world text data, such as text from the World Wide Web, need not be grammatical. Hence, methods relying on parsing or part-of-speech tagging will not perform well in these applications. Further, even if the text is grammatically correct, these methods may not scale well to large corpora. The task of building semantically related sets of words from a corpus of documents, along with allied problems, has been studied extensively in the literature. Most of these techniques rely on part-of-speech or parse information. In this paper, we explore a less expensive method that finds semantically related words from a corpus without parsing or part-of-speech tagging, and so addresses the above problems. This work focuses on building sets of semantically related words from a corpus of documents using traditional data clustering techniques. We examine some key results and possible applications of this work. © Springer-Verlag Berlin Heidelberg 2006.
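The abstract does not say which features or which clustering algorithm the authors use, so the following is only a rough sketch of the general idea, not the paper's method. It assumes each word is represented by its per-document occurrence counts, that cosine similarity measures relatedness, and that a simple single-link agglomerative merge with a fixed threshold stands in for the "traditional data clustering techniques". The function names (`word_vectors`, `cosine`, `cluster_words`), the `threshold` parameter and the toy corpus are all hypothetical.

```python
import math
import re
from collections import defaultdict


def word_vectors(docs):
    """Map each word to its per-document counts, using plain tokenization only
    (no parsing, no part-of-speech tagging)."""
    vectors = defaultdict(lambda: [0] * len(docs))
    for i, doc in enumerate(docs):
        for token in re.findall(r"[a-z]+", doc.lower()):
            vectors[token][i] += 1
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_words(docs, threshold=0.7):
    """Single-link agglomerative clustering (an assumed stand-in technique):
    keep merging two clusters while some cross-cluster word pair has
    similarity >= threshold."""
    vectors = word_vectors(docs)
    clusters = [[w] for w in sorted(vectors)]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if any(cosine(vectors[a], vectors[b]) >= threshold
                       for a in clusters[i] for b in clusters[j]):
                    clusters[i] = sorted(clusters[i] + clusters[j])
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters


# Toy corpus (hypothetical): two documents about pets, two about finance.
docs = [
    "cat dog pet",
    "dog cat vet",
    "stock bond market",
    "bond stock trade",
]
clusters = cluster_words(docs, threshold=0.7)
```

On this toy corpus, words that occur in the same documents end up in the same set, while words from the pet documents and the finance documents share no documents and are never merged. A real corpus would likely need a finer context window than whole documents, as well as stop-word handling, neither of which this sketch attempts.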

Cite

Deepak, P., Rao, D., & Khemani, D. (2006). Building clusters of related words: An unsupervised approach. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4099 LNAI, pp. 474–483). Springer Verlag. https://doi.org/10.1007/978-3-540-36668-3_51
