A Novel Similarity Measure for Sequence Data

  • Pandi M
  • Kashefi O
  • Minaei B

Abstract

A variety of metrics have been introduced to measure the similarity of two given sequences. These metrics are widely used in applications ranging from spell correctors and categorizers to new sequence mining applications. Different metrics consider different aspects of sequences, but the essence of any sequence lies in the ordering of its elements. In this paper, we propose a novel sequence similarity measure that is based on all ordered pairs of one sequence and on a Hasse diagram built in the other sequence. In contrast with existing approaches, the proposed metric extracts all ordering features to capture sequence properties. We designed a clustering problem to evaluate the metric. Experimental results showed that it achieves higher clustering purity than metrics such as d2, Smith-Waterman, Levenshtein, and Needleman-Wunsch. The limitations of those methods originate from sequence features that they neglect and that the proposed metric takes into account.
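
To make the notion of "ordered pairs" concrete, the following is a minimal sketch in Python. It is not the measure proposed in the paper: the abstract does not specify how the Hasse diagram is constructed or how the score is computed, so this example only extracts the ordered pairs of each sequence and compares the two sets with a Jaccard ratio. The function names `ordered_pairs` and `pair_overlap_similarity` are hypothetical and chosen for illustration.

```python
from itertools import combinations


def ordered_pairs(seq):
    """Return the set of pairs (a, b) where a occurs before b in seq."""
    return {(seq[i], seq[j]) for i, j in combinations(range(len(seq)), 2)}


def pair_overlap_similarity(s, t):
    """Jaccard overlap of the two ordered-pair sets.

    Illustrative stand-in only (an assumption of this sketch); it omits
    the Hasse diagram step mentioned in the abstract.
    """
    p, q = ordered_pairs(s), ordered_pairs(t)
    if not p and not q:
        # Two sequences with no pairs at all are treated as identical.
        return 1.0
    return len(p & q) / len(p | q)


if __name__ == "__main__":
    print(pair_overlap_similarity("abc", "abc"))
    print(pair_overlap_similarity("abc", "acb"))
    print(pair_overlap_similarity("abc", "cba"))
```

Under this simplified scoring, identical sequences share every ordered pair, a reversed sequence shares none, and swapping two adjacent elements changes exactly one pair. Edit-based metrics such as Levenshtein count character operations and do not expose this kind of ordering information directly.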

Cite

Pandi, M. H., Kashefi, O., & Minaei, B. (2011). A Novel Similarity Measure for Sequence Data. Journal of Information Processing Systems, 7(3), 413–424. https://doi.org/10.3745/jips.2011.7.3.413
