Sentence clustering using continuous vector space representation

2Citations
Citations of this article
7Readers
Mendeley users who have this article in their library.
Get full text

Abstract

In this paper, we present a clustering approach based on the combined use of a continuous vector space representation of sentences and the k-means algorithm. The principal motivation of this proposal is to split a big heterogeneous corpus into clusters of similar sentences. We use the word2vec toolkit for obtaining the representation of a given word as a continuous vector space. We provide empirical evidence for proving that the use of our technique can lead to better clusters, in terms of intra-cluster perplexity and F1 score.

Cite

CITATION STYLE

APA

Chinea-Rios, M., Sanchis-Trilles, G., & Casacuberta, F. (2015). Sentence clustering using continuous vector space representation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9117, pp. 432–440). Springer Verlag. https://doi.org/10.1007/978-3-319-19390-8_49

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free