Abstract
In this paper, we describe a method based on statistical machine translation (SMT) that is able to restore accents in Hungarian texts with high accuracy. Due to the agglutination in Hungarian, there are always plenty of word forms unknown to a system trained on a fixed vocabulary. In order to be able to handle such words, we integrated a morphological analyzer into the system that can suggest accented word candidates for unknown words. We evaluated the system in different setups, achieving an accuracy above 99% in the best setup.
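The fallback idea for unknown words can be illustrated with a minimal sketch. It is not the system described above: a plain lookup table (`lexicon`) stands in for the SMT model, and a caller-supplied predicate (`is_valid_word`) stands in for the morphological analyzer. All function and parameter names are hypothetical, and the sketch handles lowercase input only.

```python
from itertools import product

# Hungarian vowels and the forms each may take once accents are restored
# (the unaccented form itself is always a possible outcome).
VARIANTS = {
    "a": "aá", "e": "eé", "i": "ií",
    "o": "oóöő", "u": "uúüű",
}
# Reverse map: accented or plain vowel -> plain vowel.
STRIP = {form: base for base, forms in VARIANTS.items() for form in forms}


def strip_accents(word):
    """Remove Hungarian accents, producing the kind of input to be restored."""
    return "".join(STRIP.get(ch, ch) for ch in word)


def candidates(word):
    """Enumerate every accented spelling of an unaccented lowercase word."""
    options = [VARIANTS.get(ch, ch) for ch in word]
    return ["".join(combo) for combo in product(*options)]


def restore(tokens, lexicon, is_valid_word):
    """Restore accents token by token.

    lexicon: dict from unaccented to accented form (assumption: a stand-in
        for the trained SMT model, which would also use context).
    is_valid_word: predicate accepting a candidate spelling (assumption: a
        stand-in for the morphological analyzer).
    """
    restored = []
    for token in tokens:
        if token in lexicon:
            restored.append(lexicon[token])
            continue
        # Unknown word: keep only the candidates the analyzer accepts.
        valid = [c for c in candidates(token) if is_valid_word(c)]
        # Without context the first accepted candidate is taken; if none is
        # accepted, the token is left unchanged.
        restored.append(valid[0] if valid else token)
    return restored
```

The number of candidates grows exponentially with the number of vowels in a word, and several candidates may be valid words (for example "út" and "üt"), so a real system needs a context-aware model to rank them, which is the role the SMT component plays in the paper.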
Cite
Zayyan, A. A., Elmahdy, M., … Al Ja’am, J. (2016). Automatic Diacritics Restoration for Dialectal Arabic Text. International Journal of Computing and Information Sciences, 12(2), 159–165. https://doi.org/10.21700/ijcis.2016.119