String matching with metric trees using an approximate distance

Ilaria Bartolini; Paolo Ciaccia; Marco Patella

Conference Proceedings


Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 2476, pp. 271-283 (2002)

DOI: 10.1007/3-540-45735-6_24



Abstract

Searching a large data set for the strings that are most similar, according to the edit distance, to a given string is a time-consuming process. In this paper we investigate the performance of metric trees, namely the M-tree, when they are extended with a cheap approximate distance function that acts as a filter to quickly discard irrelevant strings. Using the bag distance as an approximation of the edit distance, we show an improvement in performance of up to 90% with respect to the basic case. This, along with the fact that our solution is independent of both the distance used in the pre-test and the underlying metric index, demonstrates that metric indices are a powerful solution not only for many modern application areas, such as multimedia, data mining and pattern recognition, but also for the string matching problem.
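The filtering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the usual definition of the bag distance (the larger of the two multiset differences between the strings' characters, which never exceeds the edit distance), and it replaces the M-tree with a plain linear scan so that only the pre-test is shown. The function names bag_distance, edit_distance and range_query are hypothetical.

```python
from collections import Counter


def bag_distance(x: str, y: str) -> int:
    """Cheap lower bound on the edit distance (assumed definition):
    the larger of the two multiset differences of the characters."""
    bx, by = Counter(x), Counter(y)
    # Counter subtraction keeps only positive counts, i.e. a multiset difference.
    return max(sum((bx - by).values()), sum((by - bx).values()))


def edit_distance(x: str, y: str) -> int:
    """Levenshtein distance via the standard dynamic program, O(len(x) * len(y))."""
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, 1):
        curr = [i]
        for j, cy in enumerate(y, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (cx != cy)))  # substitution or match
        prev = curr
    return prev[-1]


def range_query(strings, query, radius):
    """Return the strings within `radius` edits of `query`, plus the number
    of edit-distance computations that the bag-distance filter did not avoid.
    A linear scan stands in for the metric index here (an assumption)."""
    results, edit_calls = [], 0
    for s in strings:
        # Pre-test: if the lower bound already exceeds the radius, discard.
        if bag_distance(s, query) > radius:
            continue
        edit_calls += 1
        if edit_distance(s, query) <= radius:
            results.append(s)
    return results, edit_calls


if __name__ == "__main__":
    words = ["kitten", "sitting", "mitten", "bitten", "banana"]
    matches, calls = range_query(words, "kitten", radius=1)
    print(matches, calls)
```

Because the bag distance is only a lower bound, a string that passes the pre-test may still be rejected by the exact computation: two anagrams, for instance, have bag distance 0 but a positive edit distance. The filter therefore never loses results; it only reduces how often the expensive distance is evaluated.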

Citation (APA)

Bartolini, I., Ciaccia, P., & Patella, M. (2002). String matching with metric trees using an approximate distance. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2476, pp. 271–283). Springer Verlag. https://doi.org/10.1007/3-540-45735-6_24
