Annotation and extraction of multiword expressions in Turkish treebanks

Gülşen Eryiǧit; Kübra Adalı; Dilara Torunoglu-Selamet; Umut Sulubacak; Tugba Pamay

Conference Proceedings (Open Access)

11th Workshop on Multiword Expressions, MWE 2015 - in conjunction with the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2015 (2015) 70-76

DOI: 10.3115/v1/w15-0912

Abstract

Multiword expressions (MWEs) have distinctive semantic properties, so their automatic extraction receives special attention from the natural language processing (NLP) and corpus linguistics communities and remains an active research area. Unfortunately, creating the necessary resources for this task is demanding, and many languages, Turkish among them, lack such resources. This study presents our MWE annotations on recently introduced Turkish treebanks; the annotation covers various types of linguistic units and expressions, including named entities, numerical expressions, idiomatic phrases, verb phrases with auxiliaries, and duplications. The paper aims to provide a benchmark and pave the way for further MWE extraction research for Turkish. To this end, it also reports experimental results on these annotated corpora with seven baseline approaches, a dependency parser, and a previously introduced rule-based extractor. The highest F-scores achieved on these resources are about 60%.

Cite

Eryiǧit, G., Adalı, K., Torunoglu-Selamet, D., Sulubacak, U., & Pamay, T. (2015). Annotation and extraction of multiword expressions in Turkish treebanks. In 11th Workshop on Multiword Expressions, MWE 2015 - in conjunction with the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2015 (pp. 70–76). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w15-0912
