Identifying similar sentences by using N-Grams of characters

Saïma Sultana; Ismaïl Biskri

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 10868 LNAI 833-843

DOI: 10.1007/978-3-319-92058-0_80

2 Citations

5 Readers

Abstract

Detecting similar sentences plays a major role in many fundamental text-analysis applications, such as information retrieval, categorization, paraphrase detection, summarization, and translation. In this work, we present a novel method for detecting similar sentences. The method relies on character n-grams as its basic units. It uses neither an online dictionary nor a search engine, which makes it a simple and efficient way to measure the similarity between two sentences. In addition, it uses no grammar rules or syntactic analysis, so the approach is language-independent. We analyze a range of similarity measures and compare them with our methodology. The complexity of our method is O(N²), which we regard as favorable.
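The abstract does not spell out the authors' algorithm, so the following is only a generic sketch of how character n-grams can be used to score sentence similarity without a dictionary or grammar rules. The choice of trigrams, the lowercase and whitespace normalization, the Dice coefficient, and the function names `char_ngrams` and `dice_similarity` are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Return the multiset of overlapping character n-grams in `text`.

    Normalization (lowercasing, collapsing whitespace) is an assumption
    made for this sketch, not a step described in the abstract.
    """
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def dice_similarity(a: str, b: str, n: int = 3) -> float:
    """Dice coefficient over character n-gram multisets, in [0, 1].

    Illustrative measure only; the paper may use a different one.
    """
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    total = sum(grams_a.values()) + sum(grams_b.values())
    if total == 0:
        # Both inputs are shorter than n characters: nothing to compare.
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    shared = sum((grams_a & grams_b).values())
    return 2.0 * shared / total


if __name__ == "__main__":
    s1 = "The cat sat on the mat"
    s2 = "The cat sits on the mat"
    s3 = "Stock prices fell sharply"
    print(dice_similarity(s1, s2))  # near-paraphrase pair
    print(dice_similarity(s1, s3))  # unrelated pair
```

Because the score depends only on shared character sequences, the same code applies unchanged to any language written in a character-based script, which matches the language-independence the abstract claims.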

Cite

Sultana, S., & Biskri, I. (2018). Identifying similar sentences by using N-Grams of characters. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10868 LNAI, pp. 833–843). Springer Verlag. https://doi.org/10.1007/978-3-319-92058-0_80
