Exploring Methods and Resources for Discriminating Similar Languages

Abstract

The Discriminating between Similar Languages (DSL) shared task at VarDial challenged participants to build an automatic language identification system to discriminate between 13 languages in 6 groups of highly-similar languages (or national varieties of the same language). In this paper, we describe the submissions made by team UniMelb-NLP, which took part in both the closed and open categories. We present the text representations and modeling techniques used, including cross-lingual POS tagging as well as fine-grained tags extracted from a deep grammar of English, and discuss additional data we collected for the open submissions, utilizing custom-built web corpora based on top-level domains as well as existing corpora.

Cite

Lui, M., Letcher, N., Adams, O., Duong, L., Cook, P., & Baldwin, T. (2014). Exploring Methods and Resources for Discriminating Similar Languages. In 1st Workshop on Applying NLP Tools to Similar Languages, Varieties and Dialects, VarDial 2014 at the 25th International Conference on Computational Linguistics: System Demonstrations, COLING 2014 - Proceedings (pp. 129–138). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w14-5315
