A filtering approach for Alignment-Free Biosequences comparison with Mismatches

Cinzia Pizzi

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9289 231-242

DOI: 10.1007/978-3-662-48221-6_17

4 Citations

1 Reader

Abstract

Alignment-free approaches for sequence similarity based on substring composition are increasingly attracting interest from the scientific community. In several contexts, alignment-free techniques are faster but less accurate than alignment-based approaches. Recently, several studies (e.g. [4,8,9]) have attempted to bridge the accuracy gap by introducing approximate matches into the definition of composition-based distance measures. In this work we present MissMax, an exact algorithm for computing the longest common substring with mismatches between each suffix of a sequence x and a sequence y. This collection of statistics is useful for computing two similarity measures that have recently been extended to incorporate approximate matching, namely the longest and the average common substring with k mismatches. Our approach is exact and is based on a filtering technique that, in a set of preliminary experiments, substantially reduced the size of the set of potential sites of a longest match.
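
To make the statistics named in the abstract concrete, the following is a minimal brute-force sketch of the quantity being computed: for each suffix of x, the length of the longest match against any position of y with at most k mismatches. It is not MissMax and uses no filtering; it is a quadratic-time reference for the definition only. The function names are hypothetical, the convention that mismatched positions inside a match count toward its length is an assumption, and the "average" returned here is the plain mean of the per-suffix values, not a normalized distance.

```python
def match_length(x, i, y, j, k):
    """Length of the longest common prefix of x[i:] and y[j:]
    allowing at most k mismatched positions (assumed convention)."""
    length = 0
    mismatches = 0
    while i + length < len(x) and j + length < len(y):
        if x[i + length] != y[j + length]:
            mismatches += 1
            if mismatches > k:
                break  # the (k+1)-th mismatch ends the match
        length += 1
    return length


def longest_match_per_suffix(x, y, k):
    """For every suffix x[i:], the best match length over all positions of y."""
    return [
        max((match_length(x, i, y, j, k) for j in range(len(y))), default=0)
        for i in range(len(x))
    ]


def longest_and_average(x, y, k):
    """Longest and mean per-suffix match length with at most k mismatches."""
    per_suffix = longest_match_per_suffix(x, y, k)
    longest = max(per_suffix, default=0)
    average = sum(per_suffix) / len(per_suffix) if per_suffix else 0.0
    return longest, average
```

For example, `longest_match_per_suffix("ACGT", "ACCT", 1)` evaluates every suffix of the first string against every start position in the second. An exact filtering method such as the one the abstract describes would aim to return the same values while examining far fewer candidate positions.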

Cite

Pizzi, C. (2015). A filtering approach for Alignment-Free Biosequences comparison with Mismatches. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9289, pp. 231–242). Springer Verlag. https://doi.org/10.1007/978-3-662-48221-6_17
