Fast phonetic similarity search over large repositories

Abstract

Analysis of unstructured data may be inefficient in the presence of spelling errors. Existing approaches use string similarity methods to search for valid words within a text, with a supporting dictionary. However, they are not rich enough to encode phonetic information to assist the search. In this paper, we present a novel approach for efficiently performing phonetic similarity search over large data sources, using a data structure called PhoneticMap to encode language-specific phonetic information. We validate our approach through an experiment over a data set using a Portuguese variant of a well-known repository, to automatically correct words with spelling errors. © 2014 Springer International Publishing Switzerland.
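
The abstract does not describe how PhoneticMap is built, so the following is only a minimal sketch of the general idea it points to: index dictionary words by a phonetic key so that a misspelled word can be matched by sound rather than by spelling alone. The rewrite rules, the function names (`phonetic_key`, `build_index`, `suggest`), and the sample words are all assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny rewrite rules (NOT the paper's rules).
# Order matters: multi-letter patterns are applied before the bare "h" rule.
RULES = [("ss", "s"), ("ç", "s"), ("ch", "x"), ("ph", "f"), ("h", ""), ("z", "s")]


def phonetic_key(word):
    """Reduce a word to a rough phonetic key using the illustrative rules."""
    key = word.lower()
    for pattern, replacement in RULES:
        key = key.replace(pattern, replacement)
    # Collapse runs of the same letter ("aa" -> "a").
    collapsed = []
    for ch in key:
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    return "".join(collapsed)


def build_index(dictionary):
    """Map each phonetic key to the dictionary words that share it."""
    index = defaultdict(list)
    for word in dictionary:
        index[phonetic_key(word)].append(word)
    return index


def suggest(word, index):
    """Return dictionary words with the same phonetic key, closest spelling first."""
    candidates = index.get(phonetic_key(word), [])
    return sorted(
        candidates,
        key=lambda c: SequenceMatcher(None, word.lower(), c.lower()).ratio(),
        reverse=True,
    )


# Usage: look up misspelled forms against a small sample dictionary.
index = build_index(["cabeça", "chave", "casa"])
corrections = {w: suggest(w, index) for w in ["cabessa", "xave", "caza"]}
```

Each lookup costs one key computation plus one dictionary access, which is the property that makes key-based indexing attractive for large repositories. A real system would need language-specific rules and a way to tolerate near-miss keys, neither of which this sketch attempts.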

Cite

Tissot, H., Peschl, G., & Del Fabro, M. D. (2014). Fast phonetic similarity search over large repositories. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8645 LNCS, pp. 74–81). Springer Verlag. https://doi.org/10.1007/978-3-319-10085-2_6
