Constructing a Turkish Constituency Parse TreeBank

Abstract

In this paper, we describe our initial efforts to create a Turkish constituency parse treebank by utilizing the English Penn Treebank. We employ a semi-automated approach to annotation. In our previous work [18], the English parse trees were manually translated into Turkish. In this paper, we first annotate the words morphologically in a semi-automatic manner. As a second step, we use a rule-based approach to refine the parse trees based on the morphological analyses of the words. We generated Turkish phrase structure trees for 5143 sentences from the Penn Treebank that contain fewer than 15 tokens. The annotated corpus can be used in statistical natural language processing studies for developing tools such as constituency parsers and statistical machine translation systems for Turkish.

Cite

Yıldız, O. T., Solak, E., Çandır, Ş., Ehsani, R., & Görgün, O. (2016). Constructing a Turkish constituency parse treebank. In Lecture Notes in Electrical Engineering (Vol. 363, pp. 339–347). Springer Verlag. https://doi.org/10.1007/978-3-319-22635-4_31
