Information retrieval models have been an important subject of study since 1992. These models compare a user query against a collection of documents, taking into account the co-occurrence of terms, with the objective of identifying a set of relevant documents and returning them to the user according to an evaluation criterion. Metrics such as cosine similarity and the soft cosine measure rank a set of documents by their degree of similarity. In this paper, we perform a comparative study of these two similarity metrics. The Vector Space Model (VSM) was implemented as the retrieval model. A sample of the Collection of the Association for Computing Machinery (CACM), in the domain of Computer Science, was used in the evaluation. The experimental results show that recall is 96% for both metrics, but the soft cosine measure achieves a 2% higher mean average precision.
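To make the difference between the two metrics concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes documents and queries are already represented as term-weight vectors over a shared vocabulary, and that a term-to-term similarity matrix is available. The abstract does not say how that matrix was built, so the vocabulary, the weights, and the matrix values below are hypothetical and chosen only for illustration. The function names `soft_dot`, `soft_cosine`, and `cosine` are likewise illustrative.

```python
import math


def soft_dot(a, b, sim):
    """Similarity-weighted inner product: sum over i, j of sim[i][j] * a[i] * b[j]."""
    return sum(
        sim[i][j] * a[i] * b[j]
        for i in range(len(a))
        for j in range(len(b))
    )


def soft_cosine(a, b, sim):
    """Soft cosine measure between vectors a and b under term similarity matrix sim."""
    denom = math.sqrt(soft_dot(a, a, sim)) * math.sqrt(soft_dot(b, b, sim))
    # Return 0.0 for an all-zero vector instead of dividing by zero.
    return soft_dot(a, b, sim) / denom if denom else 0.0


def cosine(a, b):
    """Ordinary cosine similarity: the soft cosine with an identity similarity matrix."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return soft_cosine(a, b, identity)


# Hypothetical three-term vocabulary: ["retrieval", "search", "matrix"].
query = [1.0, 0.0, 0.0]  # the query contains only "retrieval"
doc = [0.0, 1.0, 0.0]    # the document contains only "search"

# Assumed term similarities: "retrieval" and "search" are treated as related (0.5).
sim = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

print(cosine(query, doc))
print(soft_cosine(query, doc, sim))
```

With these hypothetical values, the ordinary cosine is 0 because the query and the document share no term, while the soft cosine is 0.5 because the similarity matrix credits the related pair of terms. This is the kind of effect that could account for a gain in precision at equal recall, although the abstract itself does not state the cause.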
CITATION STYLE
Barbosa, J. J. G., Solís, J. F., David Terán-Villanueva, J., Valdés, G. C., Florencia-Juárez, R., González, L. J. H., & Mojica Mata, M. B. (2017). Implementation of an information retrieval system using the soft cosine measure. In Studies in Computational Intelligence (Vol. 667, pp. 757–766). Springer Verlag. https://doi.org/10.1007/978-3-319-47054-2_50