Abstract
Similarity measures are fundamental concepts in text mining and information retrieval. They help quantify the similarity between documents, which in turn improves the performance of search engines and browsing techniques. Many similarity measures are now available, but it is not clear which one is most effective for finding the similarity of text documents. The aim of this paper is to provide a comparative analysis of various term-based similarity measures, such as Cosine similarity, the Jaccard coefficient, and the Dice coefficient, in order to evaluate the performance of these measures in finding the similarity of two text documents.
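As a rough illustration of the three measures named in the abstract, the sketch below computes each one for a pair of short texts. It is not the authors' implementation: the tokenizer, the function names, and the choice of raw term counts for Cosine similarity and term sets for Jaccard and Dice are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed preprocessing: lowercase, keep alphanumeric runs as terms.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(doc_a, doc_b):
    # Cosine of the angle between raw term-frequency vectors.
    va, vb = Counter(tokenize(doc_a)), Counter(tokenize(doc_b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_similarity(doc_a, doc_b):
    # |A ∩ B| / |A ∪ B| over the sets of distinct terms.
    sa, sb = set(tokenize(doc_a)), set(tokenize(doc_b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def dice_coefficient(doc_a, doc_b):
    # 2 |A ∩ B| / (|A| + |B|) over the sets of distinct terms.
    sa, sb = set(tokenize(doc_a)), set(tokenize(doc_b))
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0


if __name__ == "__main__":
    d1 = "the cat sat on the mat"
    d2 = "the cat lay on the rug"
    print("cosine :", cosine_similarity(d1, d2))
    print("jaccard:", jaccard_similarity(d1, d2))
    print("dice   :", dice_coefficient(d1, d2))
```

For any pair of documents, the set-based Dice value is at least as large as the Jaccard value, because Dice = 2J / (1 + J); the two therefore rank document pairs identically and differ only in scale.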
Cite
Afzali, M., & Kumar, S. (2017). Comparative Analysis of Various Similarity Measures for Finding Similarity of Two Documents. International Journal of Database Theory and Application, 10(2), 23–30. https://doi.org/10.14257/ijdta.2017.10.2.02