A Novel Similarity Measure for Sequence Data

Mohammad. H. Pandi; Omid Kashefi; Behrouz Minaei

Journal Article (Open Access)

Journal of Information Processing Systems (2011) 7(3) 413-424

DOI: 10.3745/jips.2011.7.3.413

Abstract

A variety of metrics have been introduced to measure the similarity of two given sequences. These metrics are widely used in applications ranging from spell correctors and categorizers to new sequence mining applications. Different metrics consider different aspects of sequences, but the essence of any sequence lies in the ordering of its elements. In this paper, we propose a novel sequence similarity measure based on all ordered pairs of one sequence, with a Hasse diagram built in the other sequence. In contrast with existing approaches, the proposed metric extracts all ordering features to capture sequence properties. We designed a clustering problem to evaluate the metric. Experimental results showed that it achieves higher clustering purity than metrics such as d2, Smith-Waterman, Levenshtein, and Needleman-Wunsch. The limitations of those methods originate from sequence features they neglect, which the proposed metric takes into account.
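
To make the abstract's terms concrete, the following Python sketch shows three ingredients it mentions: a similarity driven by ordered pairs of elements, the Levenshtein baseline, and clustering purity. This is not the authors' measure: the abstract does not specify how the Hasse diagram is constructed or scored, so the sketch assumes a much simpler stand-in, the Jaccard overlap of the two ordered-pair sets. The function names `ordered_pairs`, `pair_overlap_similarity`, `levenshtein`, and `purity` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def ordered_pairs(seq):
    """Return the set of pairs (a, b) where a occurs before b in seq."""
    return {(seq[i], seq[j]) for i, j in combinations(range(len(seq)), 2)}


def pair_overlap_similarity(s, t):
    """Jaccard overlap of ordered-pair sets.

    ASSUMPTION: a simplified stand-in for an ordering-based similarity,
    not the Hasse-diagram measure proposed in the paper.
    """
    ps, pt = ordered_pairs(s), ordered_pairs(t)
    if not ps and not pt:
        return 1.0  # two sequences with no pairs are treated as identical
    return len(ps & pt) / len(ps | pt)


def levenshtein(s, t):
    """Edit distance by dynamic programming (a baseline named in the abstract)."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        cur = [i]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (a != b)))   # substitution or match
        prev = cur
    return prev[-1]


def purity(cluster_ids, labels):
    """Fraction of items carrying the majority label of their cluster."""
    by_cluster = {}
    for c, y in zip(cluster_ids, labels):
        by_cluster.setdefault(c, Counter())[y] += 1
    return sum(max(cnt.values()) for cnt in by_cluster.values()) / len(labels)


if __name__ == "__main__":
    print(pair_overlap_similarity("abc", "acb"))
    print(pair_overlap_similarity("abc", "cba"))
    print(levenshtein("abc", "cba"))
    print(purity([0, 0, 0, 1, 1], ["x", "x", "y", "y", "y"]))
```

Under this simplified stand-in, "abc" and "cba" share no ordered pair even though they are only two substitutions apart, which illustrates the kind of ordering information that an edit distance alone does not expose.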

Cite

APA

Pandi, M. H., Kashefi, O., & Minaei, B. (2011). A Novel Similarity Measure for Sequence Data. Journal of Information Processing Systems, 7(3), 413–424. https://doi.org/10.3745/jips.2011.7.3.413
