A filtering approach for Alignment-Free Biosequences comparison with Mismatches

Abstract

Alignment-free approaches for sequence similarity based on substring composition are attracting increasing interest from the scientific community. In several contexts, alignment-free techniques are faster than alignment-based approaches, but less accurate. Recently, several studies (e.g. [4,8,9]) have attempted to bridge the accuracy gap by introducing approximate matches into the definition of composition-based distance measures. In this work we present MissMax, an exact algorithm that computes, for each suffix of a sequence x, the longest common substring with mismatches between that suffix and a sequence y. This collection of statistics is useful for computing two similarity measures that have recently been extended to incorporate approximate matching, namely the longest and the average common substring with k mismatches. The algorithm is based on a filtering technique that, in a set of preliminary experiments, substantially reduced the size of the set of potential sites of a longest match.
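To make the statistic concrete, here is a minimal brute-force sketch in Python. It is not MissMax and does not use the filtering technique, which the abstract does not detail. It assumes that the value for position i of x is the length of the longest prefix of x[i:] that matches a substring of y with at most k mismatches. The function names are hypothetical, and the average shown is a plain mean of those lengths, not the normalized distance used in the literature.

```python
def longest_match_with_mismatches(x, y, k):
    """For each suffix x[i:], return the length of its longest prefix that
    matches some substring of y with at most k mismatches.

    Brute force over all start positions in y (illustrative only).
    """
    result = []
    for i in range(len(x)):
        best = 0
        for j in range(len(y)):
            mismatches = 0
            length = 0
            # Extend the match until either sequence ends or the
            # (k+1)-th mismatch is encountered.
            while i + length < len(x) and j + length < len(y):
                if x[i + length] != y[j + length]:
                    mismatches += 1
                    if mismatches > k:
                        break
                length += 1
            best = max(best, length)
        result.append(best)
    return result


def mean_match_length(x, y, k):
    """Plain mean of the per-suffix lengths (assumed, unnormalized form)."""
    lengths = longest_match_with_mismatches(x, y, k)
    return sum(lengths) / len(lengths) if lengths else 0.0


if __name__ == "__main__":
    print(longest_match_with_mismatches("ACGT", "ACCT", 0))  # [2, 1, 0, 1]
    print(longest_match_with_mismatches("ACGT", "ACCT", 1))  # [4, 3, 2, 1]
```

This sketch examines every pair of start positions, so its cost grows with the product of the two sequence lengths times the match length. A filtering approach, as described in the abstract, aims to discard most candidate sites before any extension is attempted.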

Cite

Pizzi, C. (2015). A filtering approach for Alignment-Free Biosequences comparison with Mismatches. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9289, pp. 231–242). Springer Verlag. https://doi.org/10.1007/978-3-662-48221-6_17
