Identification of similar documents using coherent chunks

Sobha Lalitha Devi; Sankar Kuppan; Kavitha Venkataswamy; Pattabhi R.K. Rao

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2009) 5847 LNAI 54-68

DOI: 10.1007/978-3-642-04975-0_5

Abstract

We focus on automatically finding similar documents using coherent chunks. The similarity between documents is determined by identifying the coherent chunks present in them. We apply linguistic rules to identify the coherent chunks and use the Vector Space Model (VSM) to determine the similarity among documents. The documents used in this work are patent documents from the USPTO. This method of using coherent chunks to identify similar documents has shown encouraging results. © 2009 Springer-Verlag Berlin Heidelberg.
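
To make the VSM step concrete, the following is a minimal sketch and not the authors' implementation. It assumes the coherent chunks have already been extracted (the linguistic rules are not reproduced here), treats each distinct chunk as one vector dimension, and uses a smoothed TF-IDF weighting with cosine similarity. The weighting scheme, the function names `chunk_vectors` and `cosine`, and the sample chunks are all assumptions made for illustration; the abstract does not specify them.

```python
import math
from collections import Counter


def chunk_vectors(docs):
    """Build sparse TF-IDF vectors from documents given as lists of chunks.

    `docs` is a list of documents, each a list of chunk strings that are
    assumed to have been extracted beforehand. The smoothed IDF used here
    is an illustrative choice, not one taken from the paper.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each chunk occurs.
    doc_freq = Counter()
    for chunks in docs:
        doc_freq.update(set(chunks))

    vectors = []
    for chunks in docs:
        term_freq = Counter(chunks)
        vectors.append({
            chunk: count * (math.log((1 + n_docs) / (1 + doc_freq[chunk])) + 1)
            for chunk, count in term_freq.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(chunk, 0.0) for chunk, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical chunks, invented purely for illustration.
docs = [
    ["optical sensor", "signal processing unit", "optical sensor"],
    ["optical sensor", "signal processing unit"],
    ["polymer coating", "curing temperature"],
]
vectors = chunk_vectors(docs)
print(cosine(vectors[0], vectors[1]))  # documents sharing chunks
print(cosine(vectors[0], vectors[2]))  # documents sharing no chunks
```

In this sketch, documents that share chunks score above zero and documents with no chunk in common score exactly zero. A similarity threshold for deciding that two patents are "similar" would have to be chosen empirically.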

Cite

Lalitha Devi, S., Kuppan, S., Venkataswamy, K., & Rao, P. R. K. (2009). Identification of similar documents using coherent chunks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5847 LNAI, pp. 54–68). https://doi.org/10.1007/978-3-642-04975-0_5
