Similarity matching of pairs of text using CACT algorithm

Ch Naga Santhosh Kumar; V. Pavan Kumar; K. S. Reddy

Journal Article (Open Access)

International Journal of Engineering and Advanced Technology (2019) 8(6) 2296-2298

DOI: 10.35940/ijeat.F8685.088619

13 Citations

16 Readers

Abstract

In data mining, short-text analysis is widely used in many applications. Because of the syntax of the language, short text is very difficult to analyze with traditional natural language processing tools, and these tools are often not applied correctly. Short texts provide sparse and insufficient data, and their high noise and ambiguity make it difficult to identify semantic knowledge. In this paper, the authors propose replacing the cosine similarity coefficient with the Jaro-Winkler similarity measure to obtain the similarity match between pairs of text (source text and target text). Jaro-Winkler does a better job of determining string similarity because it takes order into account, using positional indices to estimate relevance. CACT driven by Jaro-Winkler is presumed to offer optimized performance on one-to-many data links compared with CACT driven by cosine. The ensemble algorithm CACTS and SAE is adopted with the Jaro-Winkler similarity approach. The new algorithm is employed for short text analysis and yields better results. An evaluation of the proposed concept serves as validation.
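The contrast the abstract draws between the two measures can be illustrated with a minimal Python sketch. It is not the CACT algorithm, which the abstract does not specify; it only compares a bag-of-words cosine score with a Jaro-Winkler score on the same pair of strings. The function names (`jaro`, `jaro_winkler`, `cosine_bow`) are hypothetical, the whitespace tokenization used for cosine is an assumption, and the prefix scale `p=0.1` with a prefix cap of 4 follows the values commonly used for Jaro-Winkler rather than anything stated in the paper.

```python
from collections import Counter
from math import sqrt


def jaro(s: str, t: str) -> float:
    """Jaro similarity: matches within a sliding window, penalised for transpositions."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    # Characters match only if they are equal and at most `window` positions apart.
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_hit = [False] * len(s)
    t_hit = [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        lo = max(0, i - window)
        hi = min(len(t), i + window + 1)
        for j in range(lo, hi):
            if not t_hit[j] and t[j] == ch:
                s_hit[i] = t_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that line up in a different order count as half-transpositions.
    s_matched = [c for c, hit in zip(s, s_hit) if hit]
    t_matched = [c for c, hit in zip(t, t_hit) if hit]
    transpositions = sum(a != b for a, b in zip(s_matched, t_matched)) // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s: str, t: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro score boosted by the length of the common prefix (assumed defaults)."""
    j = jaro(s, t)
    prefix = 0
    for a, b in zip(s[:max_prefix], t[:max_prefix]):
        if a != b:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def cosine_bow(s: str, t: str) -> float:
    """Cosine similarity over whitespace-token counts; word order is ignored."""
    a, b = Counter(s.split()), Counter(t.split())
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    source, target = "data mining text", "text mining data"
    print("cosine      :", round(cosine_bow(source, target), 4))
    print("jaro-winkler:", round(jaro_winkler(source, target), 4))
```

For the reordered pair in the usage lines, the bag-of-words cosine treats the two strings as identical, while the Jaro-Winkler score falls below 1 because character positions differ. This is the order sensitivity the abstract refers to.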

Santhosh Kumar, C. N., Pavan Kumar, V., & Reddy, K. S. (2019). Similarity matching of pairs of text using CACT algorithm. International Journal of Engineering and Advanced Technology, 8(6), 2296–2298. https://doi.org/10.35940/ijeat.F8685.088619
