k-factor-based cosine similarity measurement

Abstract

With the rapid increase in the volume of text documents on the Internet, collections of documents in digitized form are piling up every day. Effective and efficient measurement techniques are needed to deal with such an enormous amount of data. Text similarity measurement is one of the most important tools for properly understanding documents in information retrieval and text mining problems. Conventional similarity measurement and cosine similarity measurement are based on Euclidean distance (L2 norm), which gives good results only for low-dimensional data. Hellinger distance-based similarity measurement (L1 norm), which has a constant metric value, works well only for some data mining applications. In this paper, we propose a new similarity measurement based on the Lk metric. It is incorporated into the existing L1 metric-based cosine similarity measurement with decreasing values of k (0.49, 0.48, and 0.47), which reflect the relative contrast of distances to the query point. Performance evaluation shows that the proposed method is effective compared with the existing one and is also suitable for query-based search, where documents are ranked with respect to the query document.
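
The abstract does not give the exact formula, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a cosine-style score in which the usual L2 norms in the denominator are replaced by a Minkowski-style Lk quantity with a fractional k; the function names `lk_norm`, `lk_cosine`, and `rank_documents`, and the toy term-frequency vectors, are hypothetical.

```python
def lk_norm(vec, k):
    """Minkowski-style L_k quantity: (sum |x_i|^k)^(1/k).

    For k < 1 this is not a true norm (the triangle inequality fails),
    but it is still well defined for real-valued vectors.
    """
    return sum(abs(x) ** k for x in vec) ** (1.0 / k)


def lk_cosine(a, b, k=2.0):
    """Cosine-style similarity with L_k normalisation (an assumption).

    With k = 2 this reduces to the standard cosine similarity.
    """
    denom = lk_norm(a, k) * lk_norm(b, k)
    if denom == 0:
        return 0.0  # at least one vector is all zeros
    dot = sum(x * y for x, y in zip(a, b))
    return dot / denom


def rank_documents(query, docs, k):
    """Return (name, score) pairs sorted from most to least similar."""
    scores = [(name, lk_cosine(query, vec, k)) for name, vec in docs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical term-frequency vectors, for illustration only.
    query = [1, 1, 0]
    docs = {"d1": [2, 2, 0], "d2": [0, 1, 3], "d3": [1, 0, 0]}
    for k in (2.0, 0.49, 0.48, 0.47):
        print(k, rank_documents(query, docs, k))
```

Running the script prints one ranking per value of k, which makes it easy to see how a fractional k changes the scores relative to the standard k = 2 case. Whether this normalisation matches the measure evaluated in the paper would need to be checked against the full text.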

Cite

Siddiqui, N., & Islam, S. (2019). k-factor-based cosine similarity measurement. In Smart Innovation, Systems and Technologies (Vol. 107, pp. 643–650). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-13-1747-7_63
