Building clusters of related words: An unsupervised approach

P. Deepak; Delip Rao; Deepak Khemani

Conference Proceedings


Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006) 4099 LNAI 474-483

DOI: 10.1007/978-3-540-36668-3_51



Abstract

Finding semantically related words in a text corpus has applications in lexicon induction, word sense disambiguation, and information retrieval, among others. Real-world text, such as text from the World Wide Web, need not be grammatical, so methods that rely on parsing or part-of-speech tagging will not perform well on it. Further, even when the text is grammatical, these methods may not scale well to large corpora. Building sets of semantically related words from a corpus of documents, along with allied problems, has been studied extensively in the literature, but most existing techniques rely on part-of-speech or parse information. In this paper, we explore a less expensive method that finds semantically related words without parsing or part-of-speech tagging, addressing both problems. The work focuses on building sets of semantically related words from a corpus of documents using traditional data clustering techniques. We examine some key results and possible applications of this work. © Springer-Verlag Berlin Heidelberg 2006.
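
The abstract does not say which features or clustering algorithm the paper uses, so the following is only an illustrative sketch of the general idea: grouping words from raw text with a traditional clustering technique and no parser or tagger. It assumes, purely for illustration, that each word is represented by the set of documents it occurs in, that similarity is Jaccard overlap, and that clusters are formed by single-link merging. The function name `cluster_related_words` and the parameters `threshold` and `min_docs` are hypothetical.

```python
import re
from itertools import combinations


def cluster_related_words(docs, threshold=0.5, min_docs=2):
    """Group words whose document-occurrence sets overlap strongly.

    Assumed setup (not taken from the paper): document co-occurrence
    features, Jaccard similarity, single-link clustering.
    """
    # Map each word to the set of document indices it occurs in.
    occurrences = {}
    for i, doc in enumerate(docs):
        for word in set(re.findall(r"[a-z]+", doc.lower())):
            occurrences.setdefault(word, set()).add(i)

    # Drop words seen in fewer than min_docs documents.
    words = sorted(w for w, d in occurrences.items() if len(d) >= min_docs)

    # Union-find over words; merging linked pairs yields single-link clusters.
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path halving
            w = parent[w]
        return w

    for a, b in combinations(words, 2):
        da, db = occurrences[a], occurrences[b]
        jaccard = len(da & db) / len(da | db)
        if jaccard >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for w in words:
        clusters.setdefault(find(w), set()).add(w)
    # Keep only clusters with at least two members.
    return sorted(sorted(c) for c in clusters.values() if len(c) > 1)


docs = [
    "cheap flights and hotel deals",
    "book flights hotel online",
    "python code examples",
    "python code tutorial",
]
clusters = cluster_related_words(docs)
print(clusters)
```

On this toy corpus the sketch groups `flights` with `hotel` and `python` with `code`. The pairwise loop is quadratic in the vocabulary size and no stopword filtering is applied, so a real corpus would need frequency pruning or a more scalable clustering method.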

Cite


Deepak, P., Rao, D., & Khemani, D. (2006). Building clusters of related words: An unsupervised approach. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4099 LNAI, pp. 474–483). Springer Verlag. https://doi.org/10.1007/978-3-540-36668-3_51
