This paper addresses the problem of grapheme-to-phoneme conversion for creating a pronunciation dictionary from a vocabulary of the most frequent words in European Portuguese. It describes a system based on a mixed approach, founded on a stochastic model with embedded rules for stressed vowel assignment. The implemented model can generate pronunciations for unrestricted words; however, a dictionary with the 40k most frequent words was constructed and corrected interactively. The dictionary includes homographs with multiple pronunciations. The vocabulary was defined using the CETEMPúblico corpus. The model and dictionary are publicly available. © 2012 The Brazilian Computer Society.
Veiga, A., Candeias, S., & Perdigão, F. (2013). Generating a pronunciation dictionary for European Portuguese using a joint-sequence model with embedded stress assignment. Journal of the Brazilian Computer Society, 19(2), 127–134. https://doi.org/10.1007/s13173-012-0088-0