This article describes a heuristic approach to zero subject detection in Polish, treated as a crucial step in end-to-end coreference resolution. Zero subject verbs are recognized using a set of manually created rules that draw on information from several sources, including a dependency parser, a shallow relational parser, and a valence dictionary. The rules were developed and evaluated on the Polish Coreference Corpus. The experimental results show that the presented method significantly outperforms MentionDetector, the only machine learning-based alternative for Polish. We also discuss and evaluate the importance of zero subject detection for existing coreference resolution tools for Polish.
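To make the idea of rule-based zero subject detection concrete, the following is a minimal illustrative sketch, not the rule set from the paper. It assumes a simplified token representation with a dependency head and label, a hypothetical `finite` flag, a hypothetical subject label `"subj"`, and a small set of subjectless verb forms standing in for a valence dictionary lookup. A finite verb is flagged as a zero subject candidate when no token depends on it as a subject and the verb is not listed as subjectless.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass
class Token:
    index: int            # position of the token in the sentence
    form: str             # surface form
    pos: str              # coarse part of speech, e.g. "VERB"
    head: int             # index of the syntactic head, -1 for the root
    deprel: str           # dependency label (label names are assumptions)
    finite: bool = False  # hypothetical flag: verb is inflected for person


# Assumed label for subject dependents; a real parser may use other names.
SUBJECT_LABELS = frozenset({"subj"})


def zero_subject_verbs(
    sentence: List[Token],
    subjectless_forms: FrozenSet[str] = frozenset(),
) -> List[int]:
    """Return indices of finite verbs that have no explicit subject dependent.

    `subjectless_forms` is a stand-in for a valence dictionary: verbs listed
    there never take a subject, so they are not zero subject candidates.
    """
    # Collect every token index that governs at least one subject dependent.
    heads_with_subject = {
        t.head for t in sentence if t.deprel in SUBJECT_LABELS
    }
    return [
        t.index
        for t in sentence
        if t.pos == "VERB"
        and t.finite
        and t.index not in heads_with_subject
        and t.form.lower() not in subjectless_forms
    ]


# "Jan kupił chleb." (Jan bought bread): the verb has an explicit subject.
jan_kupil = [
    Token(0, "Jan", "NOUN", 1, "subj"),
    Token(1, "kupił", "VERB", -1, "root", finite=True),
    Token(2, "chleb", "NOUN", 1, "obj"),
]

# "Kupił chleb." ((He) bought bread): the subject is dropped.
kupil = [
    Token(0, "Kupił", "VERB", -1, "root", finite=True),
    Token(1, "chleb", "NOUN", 0, "obj"),
]

# "Świta." (It is dawning): treated here as a subjectless verb.
swita = [Token(0, "Świta", "VERB", -1, "root", finite=True)]
```

With these toy sentences, only the verb in `kupil` is reported as a zero subject candidate; the verb in `swita` is filtered out when `"świta"` is passed in `subjectless_forms`. A practical system would need many more rules and real parser output.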
Kaczmarek, A., & Marcińczuk, M. (2015). Heuristic algorithm for zero subject detection in Polish. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9302, pp. 378–386). Springer Verlag. https://doi.org/10.1007/978-3-319-24033-6_43