MissMax: Alignment-free sequence comparison with mismatches through filtering and heuristics

Abstract

Background: Measuring sequence similarity is central to many problems in bioinformatics. In several contexts, alignment-free techniques based on exact occurrences of substrings are faster, but also less accurate, than alignment-based approaches. Recently, several studies have attempted to bridge the accuracy gap by introducing approximate matches into the definition of composition-based similarity measures.

Results: In this work we present MissMax, an exact algorithm for computing the longest common substring with mismatches between each suffix of a sequence x and a sequence y. This collection of statistics is useful for computing two similarity measures: the longest and the average common substring with k mismatches. As a further contribution, we provide a "relaxed" version of MissMax that does not guarantee the exact solution but is faster in practice and still very precise.
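To make the statistic concrete, the following is a minimal brute-force sketch in Python. It is not the MissMax algorithm (the abstract does not describe its filtering steps or heuristics); it is only a naive quadratic baseline showing what is being computed. The function names are hypothetical, and the sketch assumes that the value for suffix i of x is the length of the longest prefix of x[i:] that matches some substring of y with at most k mismatches. The "average" shown is a plain arithmetic mean; any normalization used in the paper is not reproduced here.

```python
def lcs_k_per_suffix(x: str, y: str, k: int) -> list[int]:
    """For each suffix x[i:], return the length of its longest prefix that
    matches a substring of y with at most k mismatches (naive baseline)."""
    result = []
    for i in range(len(x)):
        best = 0
        for j in range(len(y)):
            mismatches = 0
            length = 0
            # Extend the pair (i, j) until the (k+1)-th mismatch or a sequence end.
            while i + length < len(x) and j + length < len(y):
                if x[i + length] != y[j + length]:
                    if mismatches == k:
                        break
                    mismatches += 1
                length += 1
            best = max(best, length)
        result.append(best)
    return result


def longest_k(x: str, y: str, k: int) -> int:
    """Longest common substring with at most k mismatches (max over suffixes)."""
    return max(lcs_k_per_suffix(x, y, k), default=0)


def average_k(x: str, y: str, k: int) -> float:
    """Unnormalized mean of the per-suffix values (assumption: plain mean)."""
    values = lcs_k_per_suffix(x, y, k)
    return sum(values) / len(values) if values else 0.0
```

For example, `lcs_k_per_suffix("ACGT", "ACTT", 1)` gives one value per suffix of x, and `longest_k` and `average_k` reduce that list to the two similarity measures mentioned in the abstract. An exact method such as MissMax would have to return the same per-suffix values as this baseline, only faster.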

Cite

Pizzi, C. (2016). MissMax: Alignment-free sequence comparison with mismatches through filtering and heuristics. Algorithms for Molecular Biology, 11(1). https://doi.org/10.1186/s13015-016-0072-x
