Multilingual Universal Dependency Parsing from Raw Text with Low-resource Language Enhancement

Abstract

This paper describes the system of our team Phoenix for participating in the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. Given the annotated gold-standard data in CoNLL-U format, we train the tokenizer, tagger, and parser separately for each treebank using the open-source pipeline tool UDPipe. Our system reads plain text as input, performs the preprocessing steps (tokenization, lemmatization, morphological analysis), and finally outputs the syntactic dependencies. For low-resource languages with no training data, we use cross-lingual techniques and instead build models from closely related languages. In the official evaluation, our system achieves macro-averaged scores of 65.61%, 52.26%, and 55.71% for LAS, MLAS, and BLEX, respectively.
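
For reference, the per-treebank pipeline described in the abstract (load a trained UDPipe model, tokenize raw text, tag, and parse into CoNLL-U) can be driven through UDPipe's official Python bindings. The sketch below is a minimal illustration under stated assumptions, not the authors' exact setup; the model file name english-ewt-ud-2.2.udpipe is an assumed example.

    # Minimal sketch: run a trained UDPipe model end-to-end on raw text.
    # Assumes the ufal.udpipe bindings (pip install ufal.udpipe) and an
    # illustrative pre-trained model file "english-ewt-ud-2.2.udpipe".
    from ufal.udpipe import Model, Pipeline, ProcessingError

    model = Model.load("english-ewt-ud-2.2.udpipe")
    if not model:
        raise RuntimeError("cannot load UDPipe model")

    # Tokenize the raw text, then apply the model's default tagger and
    # parser, producing CoNLL-U output with syntactic dependencies.
    pipeline = Pipeline(model, "tokenize",
                        Pipeline.DEFAULT, Pipeline.DEFAULT, "conllu")
    error = ProcessingError()
    conllu = pipeline.process("The quick brown fox jumps over the lazy dog.", error)
    if error.occurred():
        raise RuntimeError(error.message)
    print(conllu)

Training a separate model for each treebank from its gold CoNLL-U data uses the same tool's training mode (udpipe --train); the abstract does not state the hyperparameters used, so they are omitted here.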

Cite

APA

Wu, Y., Zhao, H., & Tong, J. J. (2018). Multilingual Universal Dependency Parsing from Raw Text with Low-resource Language Enhancement. In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies (pp. 74–80). Association for Computational Linguistics. https://doi.org/10.18653/v1/K18-2007
