Similarity word-sequence kernels for sentence clustering

3Citations
Citations of this article
4Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

In this paper, we present a novel clustering approach based on the use of kernels as similarity functions and the C-means algorithm. Several word-sequence kernels are defined and extended to verify the properties of similarity functions. Afterwards, these monolingual word-sequence kernels are extended to bilingual word-sequence kernels, and applied to the task of monolingual and bilingual sentence clustering. The motivation of this proposal is to group similar sentences into clusters so that specialised models can be trained for each cluster, with the purpose of reducing in this way both the size and complexity of the initial task. We provide empirical evidence for proving that the use of bilingual kernels can lead to better clusters, in terms of intra-cluster perplexities. © 2010 Springer-Verlag Berlin Heidelberg.

Cite

CITATION STYLE

APA

Andrés-Ferrer, J., Sanchis-Trilles, G., & Casacuberta, F. (2010). Similarity word-sequence kernels for sentence clustering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6218 LNCS, pp. 610–619). https://doi.org/10.1007/978-3-642-14980-1_60

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free