This paper introduces Hidden Markov Models (HMMs) with N-gram observations based on the bound morphemes (affixes) of words, applied to natural language text processing with a focus on syntactic classification. In general, truncating consecutive grams to their affixes reduces the precision of the observation but reveals statistically significant dependencies. Hence, a considerably smaller training data set is required. The impact of affix observation on knowledge generalization, and the associated improvement in word mapping, is also described. The focal point of this paper is the evaluation of the HMMs in syntactic analysis of English and Polish, based on the Penn and Składnica treebanks. In total, 10 HMMs differing in the structure of observation were compared. The experimental results show the advantages of particular configurations.
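To make the idea of affix-based N-gram observation concrete, the following is a minimal sketch and not the paper's implementation. It assumes that an affix can be approximated by a fixed-length suffix of the lowercased word; the function names `affix` and `affix_ngrams` and the parameters `k` (affix length) and `n` (gram order) are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def affix(word: str, k: int = 3) -> str:
    """Approximate a word's affix by its last k characters (assumption).

    Words of length k or shorter are kept whole.
    """
    w = word.lower()
    return w if len(w) <= k else w[-k:]


def affix_ngrams(tokens: List[str], n: int = 2, k: int = 3) -> List[Tuple[str, ...]]:
    """Map a token sequence to N-grams of affixes, one per sliding window."""
    affixes = [affix(t, k) for t in tokens]
    return [tuple(affixes[i:i + n]) for i in range(len(affixes) - n + 1)]


if __name__ == "__main__":
    # Distinct words collapse onto a shared observation symbol, which is
    # the generalization effect the abstract refers to.
    words = ["walking", "talking", "barking", "walked", "talked"]
    print(sorted({affix(w) for w in words}))
    print(affix_ngrams(["The", "dogs", "were", "barking"], n=2, k=3))
```

In this sketch, five distinct word forms map to two observation symbols, so fewer training examples are needed to estimate each emission probability, at the cost of a coarser observation.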
CITATION STYLE
Pietras, M. (2017). Hidden Markov models with affix based observation in the field of syntactic analysis. In Advances in Intelligent Systems and Computing (Vol. 534, pp. 17–26). Springer Verlag. https://doi.org/10.1007/978-3-319-48429-7_2