Spelling error patterns in Brazilian Portuguese

Abstract

Fifty years after Damerau published his statistics on the distribution of errors in typed texts, his findings are still applied across a range of languages. Because these statistics were derived from English texts, the question arises of whether they actually hold for other languages. We address this issue by analyzing a set of typed texts in Brazilian Portuguese and deriving statistics tailored to this language. The results show that diacritical marks play a major role, as indicated by the frequency of mistakes involving them. This renders Damerau's original findings mostly unfit for spelling correction systems, although they remain useful if such marks are set aside. Furthermore, a comparison between these results and those published for Spanish shows no statistically significant differences between the two languages, an indication that the distribution of spelling errors depends on the adopted character set rather than on the language itself.
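
To make the role of diacritical marks concrete, the following minimal Python sketch sorts a (typed, intended) word pair into Damerau's four single-error classes (insertion, deletion, substitution, transposition) and sets apart errors that involve only diacritics. The separate "diacritic" label, the function names, and the example words are assumptions made for illustration; they are not the taxonomy or the data used in the article.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks, e.g. 'você' -> 'voce'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def classify_error(typed: str, intended: str) -> str:
    """Label a (typed, intended) pair with a single-error class.

    The 'diacritic' class is a hypothetical addition for this sketch;
    the remaining labels are Damerau's four single-error types.
    """
    # Compare composed forms so that 'é' counts as one character.
    typed = unicodedata.normalize("NFC", typed)
    intended = unicodedata.normalize("NFC", intended)

    if typed == intended:
        return "correct"
    # The words differ only in their diacritical marks.
    if strip_diacritics(typed) == strip_diacritics(intended):
        return "diacritic"

    if len(typed) == len(intended):
        diffs = [i for i in range(len(typed)) if typed[i] != intended[i]]
        if len(diffs) == 1:
            return "substitution"
        # Two adjacent characters swapped.
        if (
            len(diffs) == 2
            and diffs[1] == diffs[0] + 1
            and typed[diffs[0]] == intended[diffs[1]]
            and typed[diffs[1]] == intended[diffs[0]]
        ):
            return "transposition"

    # One extra character was typed.
    if len(typed) == len(intended) + 1:
        if any(typed[:i] + typed[i + 1:] == intended for i in range(len(typed))):
            return "insertion"

    # One character is missing from the typed word.
    if len(typed) + 1 == len(intended):
        if any(
            intended[:i] + intended[i + 1:] == typed for i in range(len(intended))
        ):
            return "deletion"

    return "other"


if __name__ == "__main__":
    pairs = [("voce", "você"), ("caas", "casa"), ("cassa", "casa"), ("csa", "casa")]
    for typed, intended in pairs:
        print(typed, intended, classify_error(typed, intended))
```

Under this sketch, a pair such as "voce" / "você" is counted as a diacritic error rather than a plain substitution. Counting it as a substitution instead would correspond to setting diacritical marks aside.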

Gimenes, P. A., & Carvalho, A. M. B. R. (2015). Spelling error patterns in Brazilian Portuguese. Computational Linguistics, 41(1), 175–184. https://doi.org/10.1162/COLI_a_00216
