This paper studies the effect of pre-annotated clause boundaries on dependency parsing of Estonian new media texts. Our hypothesis is that correctly identified clause boundaries improve parsing: once the text is split into smaller, syntactically meaningful units, the parser should find it easier to determine the syntactic structure of each unit. To test the hypothesis, we performed two experiments on a 14,000-word corpus of Estonian web texts whose morphological analysis had been manually validated. In the first experiment, the corpus with gold-standard morphological tags was parsed with MaltParser both with and without the manually annotated clause boundaries. In the second experiment, only the segmentation of the text was preserved, and the morphological analysis was performed automatically before parsing. The experiments confirmed our hypothesis by a small margin: in both experiments, the labeled attachment score (LAS) improved by 0.6%.
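As an illustration of the metric reported above, the following is a minimal sketch of how LAS is commonly computed: the percentage of tokens whose predicted head and dependency label both match the gold standard. The function name, the toy sentence, and the label names are assumptions made for this example; they are not taken from the paper's corpus or evaluation scripts.

```python
from typing import List, Tuple

# One token's analysis: (index of its head, dependency label).
# Head index 0 denotes the artificial root node.
Arc = Tuple[int, str]


def labeled_attachment_score(gold: List[Arc], predicted: List[Arc]) -> float:
    """Return LAS as a percentage: tokens with correct head AND label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty token sequence")
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return 100.0 * correct / len(gold)


# Hypothetical four-token sentence; the labels are illustrative only.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
# Hypothetical parse without clause boundaries: one wrong label.
without_boundaries = [(2, "nsubj"), (0, "root"), (2, "obl"), (2, "punct")]
# Hypothetical parse with clause boundaries: all arcs correct.
with_boundaries = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]

las_without = labeled_attachment_score(gold, without_boundaries)
las_with = labeled_attachment_score(gold, with_boundaries)
print(f"LAS without boundaries: {las_without:.1f}")
print(f"LAS with boundaries:    {las_with:.1f}")
```

On a real corpus the same function would be applied to all tokens of the two parser outputs, and the difference between the two scores would correspond to the kind of improvement the abstract reports.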
CITATION STYLE
Särg, D., Muischnek, K., & Müürisep, K. (2018). Annotated clause boundaries’ influence on parsing results. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11107 LNAI, pp. 171–179). Springer Verlag. https://doi.org/10.1007/978-3-030-00794-2_18