k-factor-based cosine similarity measurement

Nadia Siddiqui; Saiful Islam

Conference Proceedings

Smart Innovation, Systems and Technologies (2019) 107 643-650

DOI: 10.1007/978-981-13-1747-7_63

Citations: 3

Readers: 1

Abstract

With the rapid increase in the volume of text documents on the Internet, collections of documents in digitized form are piling up every day. There is a need for effective and efficient measurement techniques to deal with such an enormous amount of data. Text similarity measurement is one of the most important metrics for properly understanding documents in information retrieval and text mining problems. The conventional similarity measurement and cosine similarity measurement are based on Euclidean distance (L2 norm), which gives better results only for lower-dimensional data. Hellinger distance-based similarity measurement (L1 norm), which has a constant metric value, works well only for some data mining applications. In this paper, we propose a new similarity measurement based on the Lk metric. It is incorporated with the existing L1 metric-based cosine similarity measurement, with decreasing values of k (0.49, 0.48, and 0.47), and depicts the relative contrast of distances to the query point. Performance evaluation shows that the proposed method is indeed effective compared to the existing one and is also suitable for query-based search, assigning a rank to documents with respect to the query document.
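The abstract does not give the formula for the proposed measure, so the following is only a minimal sketch of the ingredients it names, not the authors' method. It assumes the generic Minkowski-style Lk distance with a fractional exponent, the standard cosine similarity, and the usual relative contrast (Dmax - Dmin) / Dmin of distances to a query point. All function names and the toy vectors are hypothetical and chosen for illustration.

```python
import math

def lk_distance(x, y, k):
    """Generic Minkowski-style L_k distance (assumed form, not taken from the paper).

    For 0 < k < 1 this is a fractional "distance" that does not satisfy
    the triangle inequality, but it can still be used for ranking.
    """
    return sum(abs(a - b) ** k for a, b in zip(x, y)) ** (1.0 / k)

def cosine_similarity(x, y):
    """Standard cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)

def relative_contrast(query, docs, k):
    """(Dmax - Dmin) / Dmin over the L_k distances from the query to each document."""
    distances = [lk_distance(query, d, k) for d in docs]
    d_min, d_max = min(distances), max(distances)
    if d_min == 0.0:
        return math.inf  # a document identical to the query makes the ratio unbounded
    return (d_max - d_min) / d_min

def rank_documents(query, docs, k):
    """Return document indices ordered from nearest to farthest under L_k."""
    return sorted(range(len(docs)), key=lambda i: lk_distance(query, docs[i], k))

if __name__ == "__main__":
    # Toy term-frequency vectors (hypothetical data).
    query = [1.0, 0.0, 0.0]
    docs = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
    for k in (0.49, 0.48, 0.47):  # the k values mentioned in the abstract
        print(k, rank_documents(query, docs, k))
```

Under these assumptions, comparing `relative_contrast` for k = 2, k = 1, and the fractional values above on the same collection is one way to see how the choice of k changes how well the nearest and farthest documents are separated.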

Citation (APA)

Siddiqui, N., & Islam, S. (2019). k-factor-based cosine similarity measurement. In Smart Innovation, Systems and Technologies (Vol. 107, pp. 643–650). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-13-1747-7_63
