The paper describes a method for identifying a set of interesting constructions in a syntactically annotated corpus of Czech, the Prague Dependency Treebank, by applying an automatic procedure of analysis by reduction to the trees in the treebank. The procedure clearly reveals certain linguistic phenomena that go beyond 'dependency nature' (and thus generally pose a problem for dependency-based formalisms). Moreover, it provides feedback indicating that the annotation of a particular phenomenon might be inconsistent. The paper contains a discussion and analysis of individual phenomena, as well as a quantification of the results of the automatic procedure on a subset of the treebank. The results show that the vast majority of sentences in the subset used in these experiments can be analyzed automatically, and they confirm that most of the problematic phenomena belong to the language periphery. © Springer-Verlag 2013.
CITATION STYLE
Kuboň, V., Lopatková, M., & Mírovský, J. (2013). Automatic processing of linguistic data as a feedback for linguistic theory. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8265 LNAI, pp. 252–264). https://doi.org/10.1007/978-3-642-45114-0_20