From Constituency to UD-Style Dependency: Building the First Conversion Tool of Turkish

Asli Kuzgun; Oguz Kerem Yildiz; Neslihan Cesur; Büsra Marsan; Arife Betül Yenice; Ezgi Saniyar; Oguzhan Kuyrukçu; Bilge Nas Arican; Olcay Taner Yildiz

Conference Proceedings, Open Access

International Conference Recent Advances in Natural Language Processing, RANLP (2021) 761-769

DOI: 10.26615/978-954-452-072-4_087

Abstract

This paper describes the process of building the first constituency-to-dependency conversion tool for Turkish. The starting point of this work is a previous study in which 10,000 phrase structure trees from the original Penn Treebank corpus were manually transformed into Turkish. Within the scope of this project, these Turkish phrase structure trees were automatically converted into UD-style dependency structures using both a rule-based algorithm and a machine learning algorithm tailored to the requirements of the Turkish language. The results of the two algorithms were compared, and the machine learning approach proved more accurate than the rule-based one. The output was revised by a team of linguists, and the refined versions were taken as gold-standard annotations for the evaluation of the algorithms. In addition to contributing a large dataset of 10,000 Turkish dependency trees to the UD Project, this project fills an important gap by providing a Turkish conversion tool, enabling the quick compilation of dependency corpora that can be used to train better dependency parsers.

Cite

Kuzgun, A., Yildiz, O. K., Cesur, N., Marsan, B., Yenice, A. B., Saniyar, E., … Yildiz, O. T. (2021). From Constituency to UD-Style Dependency: Building the First Conversion Tool of Turkish. In International Conference Recent Advances in Natural Language Processing, RANLP (pp. 761–769). Incoma Ltd. https://doi.org/10.26615/978-954-452-072-4_087
